








Perspectives in 
Global Science And 
Technology Communications 


Edited by 

N. VISWANADHAM, FNA 



INDIAN NATIONAL SCIENCE ACADEMY 

Bahadur Shah Zafar Marg, New Delhi 



© Indian National Science Academy, New Delhi 


Editor of Publications, INSA 
Professor SK Malik, FNA 


Editorial Staff 
J. Saketharaman, AES-I 
Rajan Phull, SO-I 


Price: Rs. 360/-; US$ 120.00 


Printed and published by Shri SK Sahni, Executive Secretary, INSA and printed at Nirmal 
Vijay Printers, New Delhi. 



Foreword 


The Indian National Science Academy as the premier national body of about 650 
elected Fellows and 100 Foreign Fellows with interests in all areas of Mathematical, 
Physical, Biological, Agricultural, Medical and Engineering Sciences has promoted 
special seminars on emerging scientific developments of wide interest to many 
disciplines and which have major impact on Society. Considering the rapid advances 
in technologies in information receipt, storage, recording, retrieval, classification, 
analysis, transmission and communications, the Academy had initiated a seminar 
and discussion on this subject. It is now well established that new knowledge is 
highly valued intellectual property leading to many further advances in technology 
for human health and in economic and social development and for the preservation 
of natural resources, environment and ecology. There are opportunities for sharing 
of such knowledge and new information. There is also need for ensuring security 
of such information as well as access and utilisation by appropriate authorisation. 

These current rates of advances and consequent obsolescence in information 
and communication technologies are surpassing all previous records of new 
knowledge creation and comprehension in the entire history of human evolution 
and civilisation. They are immediately and directly affecting the daily lives of all 
people in all nations in all walks of life and there is yet more to come. There is 
tremendous excitement as well as deep concern from these developments. The 
Seminar organised by Professor N Viswanadham in response to the wishes of a 
large number of Fellows of the Academy at this critical juncture has been a 
revelation to many. I am happy the Proceedings of the Seminar have been brought 
together in the present volume entitled Perspectives in Global Science and 
Technology Communications. The Academy is grateful to all the authors and to 
Professor Nukala Viswanadham for editing the volume. Thanks are due to Professor 
Raghavendra Gadagkar for his association in the completion of the text for 
publication. 


22 July 1998 


S Varadarajan 




From The Co-ordinator 


The recent developments in computer and communication technologies have 
made possible the evolution of the network of networks, the Internet, and the 
newest information service on the Internet, the World Wide Web. This technology 
will have tremendous impact on Science and Technology Communications and 
on Education and Training. More importantly, it would also impact the industry 
competitiveness depending on the way these information technologies are 
exploited. Although it is not the primary thrust of this seminar, the technology 
would also revolutionize the entertainment and creative arts sector. In the business 
sector, several multimillion dollar products have sprung up, the most prominent 
among these being Java and Netscape. 

The availability of the World Wide Web and the easy to use Netscape have 
significant consequences on the way scientists and technologists communicate. It 
is now possible for anyone to have a web site and post their manuscripts on 
publicly available FTP servers, thus making the moment of publication precede the 
moment of acceptance in a journal. The technology also provides for generation 
of hypertext manuscripts instead of plaintext ones. Through the Internet, people 
now have access to all publicly available information. 

The emergence of Internet is certainly a technological discontinuity. It alters 
the traditional way of functioning and brings in new rules into the game of Science 
and Technology. Cooperation with peer groups abroad, by seamless use of their 
research facilities, information and data, is possible in the internet era. The world 
wide web and Internet technologies are being used for business communications 
within large corporations with advantage. In our country, the internet services are 
at a primitive level. There is a great need for all parties concerned: scientists, 
businessmen, government, regulatory agencies and Internet service providers, to 
plan and provide these facilities as early as possible. 

This publication is the Proceedings of a Seminar on "Perspectives in Global 
Science and Technology Communications" conducted at the INSA premises on April 
1, 1996. At the Seminar, we had eight speakers and all the talks were very well 
appreciated. 

In this volume, we have five full papers and two abstracts of the presentations 
at the above seminar. The five papers are S & T Communication: A Paradigm 
Shift by N. Viswanadham and T.B. Rajashekar; Digital Libraries by V. Rajaraman; 
Information Sources and Services on the Internet by T.B. Rajashekar; Research in 
the Internet Era: An Example from Life Sciences by R. Gadagkar; and Access 
Control Security by Satish Sukumar. 

A word of caution to the readers. The Internet and communication over 
it have grown tremendously and are still growing at a phenomenally rapid rate. 
Any paper or edited volume in this area is outdated before it is printed and 
circulated. Nevertheless, this volume represents an attempt to collect all ideas 
about S & T Communications at one place and would be useful for someone who 
would want to start off doing further work or wishing to gain some general 
knowledge. 

I wish to place on record my sincere thanks to the President of INSA, Dr. S. 
Varadarajan for giving me an opportunity to organize the workshop, for his 
interest in the subject, and also for his unstinted support. 


N. Viswanadham 



Contents 


Foreword 

Preface 

Global Science and Technology Communication: 

A Paradigm Shift 
N. Viswanadham and T.B. Rajashekar 

Science and Technology Information Resources on the Internet 
T.B. Rajashekar 

Research in the E-mail and Internet Era: An Example 
from the Life Sciences 
Raghavendra Gadagkar 

Digital Libraries 
V. Rajaraman 

Access Control Security 
Satish Sukumar 

National Information Infrastructure (NII): Issues and Plans 

S. Ramakrishan 

Network Technology for Multimedia Information Dissemination 

T. Vishwanathan 




Global Science and Technology Communication: 
A Paradigm Shift 

N. VISWANADHAM 
Department of Computer Science 
Indian Institute of Science, Bangalore-560 012 
(E-Mail: vishu@csa.iisc.ernet.in) 

T.B. RAJASHEKAR 
National Centre for Science Information, 

Indian Institute of Science, Bangalore-560 012 
(E-Mail: raja@ncsi.iisc.ernet.in) 


1. Introduction 

Scientific communication is an integral part of teaching, development and research 
activities and is essential for the progress of both science and engineering. A 
scientist communicates for several reasons: to keep a permanent record of findings 
for later reference, to get feedback (criticism, refutation, refinement), to put claim 
on one's ideas and findings, to get recognition (reward), to avoid duplication/ 
repetition of work and to exchange information/data needed for scientific work. 
Science advances by building upon mutual criticism. Thus it is intrinsically a 
collaborative process, constructed on the base of those whose work was done 
earlier. Long ago Isaac Newton had said "I am able to look farther because I am 
standing on the shoulders of giants." 

Information transfer between science workers therefore is the very lifeblood 
of science. Successful scientific communication requires that a researcher be able 
to publish (communicate) and access research findings quickly and conveniently. 
Over the past few centuries, several forms of scientific communication have 
been evolved and adapted, with a view to improve the speed, convenience and 
efficiency of information transfer. These include informal mechanisms like 
personal contacts, correspondence, seminars, meetings, preprints, newsletters, 
etc. and formal mechanisms like books, journals, reports, theses, etc. Paper- 
based information has exploded since the industrial revolution and attempts 
have been made to improve access to published literature through abstracting 
and indexing (A&I) journals. 

Change is now sweeping the whole area of science communication. The 
advent of the Internet, specifically the World Wide Web (WWW) with its hypertext 
and multimedia capabilities, graphical browsers and Web publishing tools, has 
enabled a researcher to publish, communicate and access information right from 
his desktop using a networked personal computer. This technological innovation 
has a tremendous impact on the ways and means of science and technology 
communications and has changed the rules of the game. 

In this paper, we review the revolution being brought about in science 
communication by the Internet and electronic publishing. We begin first with a 
brief sketch of paper-based science communication and the use of computers and 
electronic databases in improving access to scientific information. 

2. Science Communication in the Paper Era 

Before we consider the impact of the Internet, let us take a quick look at science 
communication prevalent for the past three centuries. Formal science 
communication has taken place predominantly on paper. The life cycle of this paper- 
based communication can be depicted as follows: 

              Patents 
Researcher    Conferences          -->   Journals    A&I Services 
              Dissertations              Books 
              Technical Reports 

Patents are one of the first significant effects of the research process, particularly 
in applied research environments. Though neglected by basic science researchers, 
patents constitute a significant portion of scientific literature and are often the 
only form in which certain new information is released. Conferences form a major 
source of science communication, reporting research work much before journals. 
Dissertations and theses embody original work, providing more substantial 
information compared to papers published in conferences and journals. Technical 
reports came to prominence after World War II, as a result of the expansion in 
government sponsored research. Technical reports typically contain interim 
research results at much greater length and are a quicker means of disseminating 
research findings than journals. Efforts have been made to compile 
inventories of ongoing research, with a view to helping researchers avoid 
duplication and identify fellow investigators for possible collaboration. 

2.1 Books and Journals 

Before the 17th century, when the scientific revolution took hold, the book was the 
sole instrument for broadcasting new ideas, evidence and scientific theories. 
Modern scientific books are less fashionable as a means of original communication, 
given the considerable time and effort required in manuscript preparation, 
publication costs and delays. Because textbooks embody extant views of science, 
they continue to be important channels of science communication at colleges and 
universities. Authors of books depend extensively on journals, conferences, theses 
and other forms of original publications. 

However, the most important vehicle of information dissemination, dating 
from the 17th century, has been the papers published in scientific journals. Primary 
journals constitute the core collection of science libraries around the world. The 
paper in a science journal is characterized by brevity, citations to related works, 
evidence, data and theoretical support. The scientific journal was primarily devised 
to communicate research findings quickly, and thereby ensure that the author's 
work establishes his priority in discovering new evidence. At the same time most 
scholarly journals attempt to maintain high quality through refereeing of articles 
before acceptance for publication. The extraordinary rate of growth of science gave 
rise to an ever-growing need for more space in existing journals to report progress. 
This resulted in inordinate delays in the publishing of accepted articles and the 
proliferation of journals addressing highly specialised subject fields. 

3. Computers in Science Communication 

Until about the middle of this century, the rate of growth of literature was not too 
rapid and the manual methods of information processing and organisation were 
adequate to provide fairly comprehensive and quick access to required information. 
However, after science graduated from little science — a pursuit of the curious and 
inquisitive, to big science — with large-scale funding from governments and 
national agencies, science information has seen an exponential growth which is 
referred to in terms like information explosion and literature explosion. The 
scientist is faced with the problems of keeping pace with the ever-increasing body 
of information, scatter of literature in a broad range of journals, delays in publication 
of papers resulting in delays in learning about research efforts, and rising 
publication costs. While the secondary publications (abstracting and indexing 
publications) offered solutions to some of these problems by providing a single 
comprehensive source or guide to published literature, the sheer amount of 
published literature or information explosion forced the information community 
to look for alternative methods of information processing in terms of automation 
and use of computers for bibliographic information. 

The emergence of the machine-readable bibliographic database in the 1960s, 
as a by-product of computer-controlled photocomposition of the abstracting & 
indexing publications, was a major step towards automated information retrieval. 
Off-line information services like the selective dissemination of information (SDI) 
were started using the electronic databases and by the early 1970s online databases 
became common and popular tools for retrospective literature survey. Since the 
late 1980s, a large number of these databases have also been distributed on 
CD-ROMs, enabling quick access to millions of bibliographic citations and abstracts. 
Though these developments have made it easier to identify relevant literature 
retrospectively, access to current literature continues to be a problem. More 
importantly, none of these solutions have had any impact on the publishing 
process itself. 

Besides the above mentioned formal mechanisms, scientists use several informal 
channels to communicate. These can be personal contacts, correspondence and 
exchange of preprints and so on, among the scientists working in the same area of 
research. It is now well-established that in any scientific community, there exists 
a personal network of professionals related through similar research interests, 
institutional ties or former associations, who keep informing each other of ongoing 
and planned research projects by sending drafts of papers for comments and 
discussing current work in correspondence or at conferences. These informal 
networks of scientists termed invisible colleges are extremely effective information 
channels, but the participation is largely restricted to those who are leaders in a 
field and it takes time for a younger scientist to get accepted in the appropriate 
group. 

The Internet and the World Wide Web have brought about radical changes in 
all the aspects of both formal and informal channels of science information flow. 
Their influence has been as significant as that of the railroad, telephone, digital 
computer, paper and ink, to name a few. By connecting a PC to the Internet you 
can be a reader, accessing all public domain information or a creator, publisher and 
distributor of useful information. The Internet has now made location-independent 
global reach possible. This feature has introduced a paradigm shift in science 
communication. A paradigm is a set of rules that establishes boundaries and describes 
how to solve problems within those boundaries. When a shift occurs and changes 
are imminent, established procedures are violated. Every change brings with it the 
death of old ways of doing things, putting the unknown in their place. For example, 
currently all scientific works are peer reviewed and printed and distributed by a 
publisher with money or brand name power. With the Internet, an individual can 
create and distribute his own works. Thus quality control becomes everybody's 
business. In the next section we introduce developments related to the Internet 
and intranets. 

4. The World Wide Web and S&T Communication 

The Internet has its origins in ARPANET, a project of the US Department of 
Defence started in the late 1960s. It has now grown into a global network of 
computer networks with thousands of interlinked networks and with a user base 
of a few million which is growing by the hour. The basic communication services 
available on the Internet are e-mail, ftp for file transfer and telnet for remote login. 
In addition, a number of information services like anonymous ftp sites, Archie, 
Gopher and the World Wide Web (WWW, or simply the Web) have been developed. 

The World Wide Web has revolutionised the way information is provided and the 
way it is accessed on the net. The Web is a client-server based, hypertext, multimedia 
information system. Information is provided on the Web servers as Web pages, 
which are simple text files with all the text marked using HTML (Hyper Text 
Mark-up Language) tags. HTML has tags for providing references to other Web 
pages, which can be on the same server or any other server on the Internet 
irrespective of the geographical location. This facilitates hypertext links across the 
documents on the Internet. Web pages can contain references to images (in GIF 
and JPEG formats), audio files (in AU and WAV formats) and video files (in MPEG 
format), which adds the multimedia dimension to the information provided 
on the Web. The client accesses the Web pages on the servers, then renders and formats 
them according to the HTML tags for display on the client's system. When the 
user selects a hypertext link (indicated by underlining) on a Web page, the client 
can follow the link and fetch the referred document irrespective of the location of 
the document on the Internet. The ease of setting up Web sites and HTML pages 
has resulted in a phenomenal growth of information available on Web sites on the 
Internet. Graphical browsers/clients like Mosaic and Netscape made accessing 
the information or Web browsing popular with end-users. Today web browsing 
has become synonymous with accessing information on the Internet. 

'Hypertext' is the fundamental organising concept of WWW. It refers to the 
embedding of links within an HTML page (Web page) pointing to other documents. 
An HTML page is a text file embedded with HTML mark-up tags. The linked 
documents could comprise text, images, graphs, animations and audio-video 
clips. The power of a link in the Web is that it can point to any document on 
another Internet host computer, irrespective of its physical location. This is achieved 
through uniform resource locators (URLs), the primary elements of Web 
architecture. URLs for Web access start with the string "http://", followed by the 
address of a Web server and a specific page on that server. 
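
The mechanics of hypertext links and URLs described above can be illustrated with a small sketch. It uses only Python's standard library; the page content, the server names (www.example.org, other.example.net) and the helper names are hypothetical and chosen purely for illustration. It is not a description of how any particular browser works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Hypothetical Web page and its address; both are illustrative assumptions.
PAGE_URL = "http://www.example.org/science/index.html"
PAGE_HTML = """
<html><body>
<h1>Preprints</h1>
<p>See the <a href="papers/paper1.html">first paper</a> and a
<a href="http://other.example.net/data/results.html">related data set</a>.</p>
<img src="figures/plot.gif">
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects the targets of <a href="..."> hypertext links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # A relative link is resolved against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all hypertext links found in html."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def describe(url):
    """Split a URL into (scheme, server address, page path)."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


if __name__ == "__main__":
    for link in extract_links(PAGE_HTML, PAGE_URL):
        print(describe(link))
```

The second link points to a different server from the one holding the page, which is the location independence the text refers to: the client only needs the URL to fetch the document.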

Latest developments on the Internet like Java and Virtual Reality Mark-up 
Language (VRML) have the potential to take the degree of interactivity and user 
interfaces to a higher level. Using Java, a new programming language for the 
network, interactive applications, termed applets, can be embedded in the Web 
pages which can be run by the browsers online. VRML, a standard for 3-dimensional 
graphics on the Web allows one to include animated 3-D graphics and 3-D 
environment besides Web links and other features. Their significance lies in their 
forthcoming integration with the existing Internet tools and the changes in 
network-based information resources they will enable. 

The growth in the variety of services and applications developed over the Web 
has been spectacular. The Web can be used as a personal information system. It can be 
used to organise information for in-house use, at corporate/institutional level, 
and it can be a corporate or institutional Web site, meant for global consumption. 
Limiting ourselves to science and technology, some of the key sources and services 
on the Internet include the following: 

Electronic journals & newsletters 
Tables of contents of journals 
Science news services 
Preprints 
Discussion forums and news groups 
Technical reports 
Theses and dissertations 
Conference proceedings 
Library catalogues 
Courseware and tutorials 
Factual databases 
Bibliographic databases 
Subject specific virtual libraries 
Research institutes, universities & colleges 
Personal home pages of researchers 
Societies, academies and associations 
Bookshops 
Manufacturers and suppliers 

There is increasing recent interest in the power of using the Web within an 
organisation, in the form of private intranets. Intranets are corporate/institutional 
computer networks that predominantly use Web and TCP/IP technologies for 
improving intra-office communication and access to internally generated 
information. Intranets use firewalls, a combination of hardware and software, to 
protect internal information from being accessed by outsiders. Given the dedicated 
bandwidth and a level of trust shared by the employees, intranets enable 
information sharing at a more spontaneous and direct level than that found on the 
Internet. A related development is the increasing availability of Web-based 
groupware packages that help coworkers collaborate on an intranet to 
improve productivity. 

Other recent developments receiving considerable attention are information 
push and agent technologies. The Internet and Web are based on the 'pull' model 
of information access, where the user proactively identifies one or more information 
sources and then accesses the sites to extract relevant information. The user has to 
spend considerable effort if he wishes to periodically monitor one or more sites 
for specific information, say for example, the appearance of an article on a topic or by 
an author in a group of journals. In contrast to the 'pull' model, push technologies 
enable a user to subscribe to one or more channels (e.g. sports, news, weather), set 
up a profile describing his interest and also specify the frequency with which he 
should be notified, if any matching information appears on these channels. The 
push client software then keeps working in the background, scanning the specified 
channels for any matching information. The user is notified, either by e-mail or 
through the Web browser, when new information arrives. Several push services 
are available today, the pioneer being PointCast (www.pointcast.com). Agent 
technologies are more sophisticated than push technologies, in the sense that they 
can 'learn' and improve the profile based on the user's feedback and are also able 
to scan a wider range of sources. 
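
The push model described above (subscribe to channels, define a profile, get notified only of new matching items) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the design of PointCast or any real push product: the names Profile, Item and PushClient, and the simple keyword matching on titles, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    channels: set   # channels the user has subscribed to
    keywords: set   # words or phrases describing the user's interest


@dataclass
class Item:
    channel: str
    title: str


def matches(profile, item):
    """True if the item is on a subscribed channel and mentions a keyword."""
    if item.channel not in profile.channels:
        return False
    title = item.title.lower()
    return any(keyword.lower() in title for keyword in profile.keywords)


class PushClient:
    """Scans channel contents and reports only new, matching items."""

    def __init__(self, profile):
        self.profile = profile
        self.seen = set()  # items already examined in earlier scans

    def poll(self, items):
        """Run one background scan over the current channel contents."""
        fresh = []
        for item in items:
            key = (item.channel, item.title)
            if key in self.seen:
                continue
            self.seen.add(key)
            if matches(self.profile, item):
                fresh.append(item)
        return fresh
```

In use, the client would call `poll` at the frequency chosen by the user and send each returned item by e-mail or display it in the browser. An agent, as described in the text, would additionally adjust `profile.keywords` from the user's feedback.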





5. Personal and Group Communication on the Internet 

E-mail is the most commonly available service on computer networks connected 
to the Internet, and also perhaps the least expensive network service. It is a store- 
and-forward messaging facility providing an effective and efficient means of one- 
to-one (personal) and one-to-many (group) communication. With the availability of 
improved e-mail related protocols like SMTP (Simple Mail Transfer Protocol) and 
MIME (Multipurpose Internet Mail Extensions) and GUI-based (Graphical User 
Interface) client programs like EUDORA and Pine, e-mail handling has become 
more versatile and user friendly. 

E-mail has dramatically improved collaborative research through faster 
exchange of data and research results among research groups spread across the 
world. Researchers are able to exchange notes and develop research papers 
much more easily and quickly. E-mail has also increased the interaction between 
authors and readers. It is now common to see R&D groups using mailing lists, 
which contain the e-mail addresses of each member of the group, to broadcast 
messages across the group. There are instances of entire projects being handled by 
members located in different countries, through such mailing lists. 

E-Mail has been used to develop a variety of innovative applications on the 
Internet. One such application is discussion forums. Also called discussion lists 
and listservs, they enable a group of people having similar interests to carry out 
discussions using e-mail. The forum software maintains a list of e-mail addresses of the forum 
subscribers, and automatically distributes any message posted to the forum to all 
the members. Subscribing and unsubscribing to the forum is also carried out using 
e-mail. Discussion forums/lists have gained popularity on the Internet and form 
the most important informal information channels. Discussion lists are being run 
in all fields of science and technology. A researcher can find a discussion list in 
his/her topic of research, however specific it may be, and join the list to discuss 
problems and seek solutions from his/her peers in the field. Anyone can join 
these forums and participate in the discussion via e-mail. These forums are free of 
disadvantages such as elitism. The e-mail postings to these lists are archived at 
the site that runs the list, thus making the communications more formal and 
accessible by e-mail for later use. 
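
The behaviour of forum software described above (maintain a subscriber list, distribute each posting to every member, keep an archive) can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, the addresses are made up, and no mail is actually sent. A real list server would hand each delivery to a mail transport such as SMTP.

```python
class DiscussionList:
    """A toy model of an e-mail discussion list (listserv)."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []  # e-mail addresses of the members
        self.archive = []      # every posting, kept for later retrieval

    def subscribe(self, address):
        if address not in self.subscribers:
            self.subscribers.append(address)

    def unsubscribe(self, address):
        if address in self.subscribers:
            self.subscribers.remove(address)

    def post(self, sender, subject, body):
        """Archive a posting and return one delivery per subscriber."""
        message = {"from": sender, "subject": subject, "body": body}
        self.archive.append(message)
        # Distribution happens at user level: one copy per mailbox.
        return [(address, message) for address in self.subscribers]


if __name__ == "__main__":
    forum = DiscussionList("digital-libraries")
    forum.subscribe("a@example.org")
    forum.subscribe("b@example.org")
    for address, message in forum.post("a@example.org", "Query", "Any preprints?"):
        print(address, message["subject"])
```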





E-Mail is also finding increased use in the delivery of tables of contents, 
newsletters, full text of papers and courseware. Several publishers and information 
service agencies operate e-mail-based content page alerting services and 
automatically deliver content pages of new journal issues, to the subscribed users. 
It is also possible now to receive results of database searches, by e-mail. E-mail is 
all pervasive, user specific and inexpensive, factors that will prompt delivery of 
more and more information over this medium. 

Usenet News is a related service that has found increased use with the 
availability of the Internet. It is a universal bulletin/conferencing system, consisting 
of several thousand news groups, each news group focusing on a particular topic. 
Newsgroups are named starting with a broad topic and then focusing on specific 
topics (e.g. 'bionet.neuroscience' deals with all aspects of neuroscience, 
'sci.psychology' on all aspects of psychology). A Usenet news group is functionally 
similar to e-mail discussion forums, but they differ in the way messages are 
distributed and accessed. Usenet news is distributed at host level, whereas 
discussion forums distribute messages at user level. Thus, Usenet news does not 
get stored in e-mail boxes, so no local storage is required, except for those saved 
while scanning news on the host. Each Usenet host collects news from the 
neighboring host. The volume of messages on Usenet is very large, over 100 megabytes 
per day! A variety of software is available for reading and posting messages to news 
groups. Web browsers now support news reading. Archives of newsgroups can 
now be searched on the Web. Just like e-mail-based discussion forums, Usenet 
news groups constitute a very important source of informal communication among 
scholars. 
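
The two Usenet ideas above, hierarchical group names and host-level distribution, can be sketched as follows. This is a deliberately simplified model and rests on assumptions: the host names and the neighbour table are hypothetical, and real news servers exchange articles using dedicated protocols and article identifiers rather than the plain traversal shown here.

```python
from collections import deque


def group_hierarchy(name):
    """Split a newsgroup name into its levels, broadest topic first."""
    return name.split(".")


def propagate(neighbours, origin):
    """Return the hosts in the order they receive an article posted at origin.

    Each host passes the article to its neighbouring hosts; a host that
    already holds the article does not store it again.
    """
    received = [origin]
    seen = {origin}
    queue = deque([origin])
    while queue:
        host = queue.popleft()
        for peer in neighbours.get(host, []):
            if peer not in seen:
                seen.add(peer)
                received.append(peer)
                queue.append(peer)
    return received


if __name__ == "__main__":
    # Hypothetical hosts and their news-feed neighbours.
    feeds = {
        "hostA": ["hostB", "hostC"],
        "hostB": ["hostA", "hostD"],
        "hostC": ["hostA"],
        "hostD": ["hostB"],
    }
    print(group_hierarchy("bionet.neuroscience"))
    print(propagate(feeds, "hostA"))
```

Note that the article is stored once per host, however many readers each host has. This is the difference from e-mail discussion forums, where one copy is delivered to every subscriber's mailbox.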

6. Electronic Publishing 

Computer-based publishing through the use of sophisticated editing software and 
printing technologies has been in use for several years now. However, the Internet, 
by networking authors, editors, reviewers and the users, has brought significant 
improvements to the entire publishing process and subsequent access to the 
published information. Publishers have begun to host electronic versions of 
publications on Internet servers and provide access to end users from their desktop 
computers. 

The most visible impact of Internet-based publishing has taken place in the 
area of scholarly journals, which are appearing in the form of electronic journals 
(or simply, e-journals) on the Internet. The nature of the Internet has enabled the 
editorial process to be more interactive and instantaneous. Reviewers thousands of miles apart are 
able to review articles nearly in real time with other editors, allowing for the rapid 
development of comments and opinions. In turn, authors enjoy rapid notification 
of the editorial decision on their submissions. Digital proofs of articles for review 
by authors (and editors) usually arrive a couple of weeks in advance on the publishers' 
server. This provides the authors and editors ample opportunities for last-second 
revisions. The relatively quick publication of electronic journals, compared to print, 
makes them attractive to a diverse audience, from contributors to editors to 
readers. Equally important is the browsing made possible by the digital archives 
of the electronic journal. 

Electronic scholarly journals thus differ from printed scholarly journals in their 
accelerated peer review, combined with mercurial production schemes, allowing 
for creation of proofs and final versions very quickly. The sheer interactive nature 
of digital journals, providing ample opportunities for peers to critically analyse 
articles, and the ability to access the complete archives of a given title on a server 
make this sort of publishing a significant departure from the long established 
traditions in print. The e-journal is thus being considered a viable alternative to 
the printed journal to cut down the publication delays as well as costs. There has 
been a proliferation of e-journal providers on the Internet. Major publishers of 
science journals like Elsevier, Academic Press and Blackwell Scientific have 
announced their commitment to provide Web access to their journals. They have 
been joined by professional societies and associations like the American Chemical 
Society, IEEE, IEE and the American Association for the Advancement of Science 
(AAAS). With payment of the appropriate license fee, most of the journals from these 
publishers can be accessed over the Internet today. 

Various other forms of traditionally print oriented science and technology 
publications have made their appearance on the Internet. These include preprints 
(e.g. the American Mathematical Society preprints service), technical reports (e.g. 
NCSTRL, the Networked Computer Science Technical Reference Library), library 
catalogues (e.g. the Library of Congress and the University of California library 
catalogues) and patents (e.g. the free access to US patents provided by the IBM 
Patent Server). An Electronic Thesis and Dissertations (ETD) project has recently 
been launched in the U.S., with participation by universities around the world. 





Campus Wide Information Services (CWIS) of universities and research 
institutes have become a major channel of science and technology communication. 
These typically include an institutional Web site and several department or division 
Web sites, operating over an intranet. These provide access to a storehouse of 
information like the preprint/technical report collections of any faculty/researcher 
or the details of ongoing research projects and so on. 

A major impetus for publishing on the Internet has come about thanks to the 
initiatives taken in digital library research in U.S.A., and several other countries'^- 
Digital libraries provide access to very large collections of distributed, full text, 
multimedia databases. Examples include the ACM digital library¹⁶ and the IEEE 
Computer Society digital library¹⁷. 

7. Infrastructural Needs 

Global science and technology communications require vast resources particularly 
of computers and communications: computer networks linking laboratories and 
universities to the regional networks, the regional networks to the national networks 
and the national ones to the global network. A high-speed, high-bandwidth network 
is the basic infrastructure, and it needs to be timeshared with other personal, 
business, recreational, commercial and entertainment applications. In some sense 
overall growth in all sectors is essential for establishing and maintaining Web 
communications. 

Development and deployment of robust information infrastructures at national 
and regional levels and their interconnection holds the key to the development of 
future information networks. The Global Information Infrastructure (GII), National 
Information Infrastructure (NII) of individual countries, and Regional Information 
Infrastructures (RII), are expected to serve as major components in building an 
intelligent society for the 21st century. GII will provide open access and universal 
availability of services, by securing interconnectivity and interoperability between 
NIIs and RIIs at a global level. At the core of NII is a very high speed backbone 
network spread across the country, with international connectivity. The backbone 
will carry multimedia digital information at high speeds from one corner of the 
country to another or send it to an outside location via international gateways. The 
backbone is built using fibre optic cables, satellite and radio links, high speed 
routers, hubs and switches. It will interconnect all smaller networks within the 
country and provide seamless network access to individuals and institutions'®'^® 





In a country where an NII is in place, the citizens will be able to tap into the 
vast reservoirs of electronically stored information to improve their businesses, 
and to make their working, personal and social lives better. It will link computers 
and other information appliances in homes, offices, schools and factories across 
the country, and facilitate global access. The NII will serve as the infrastructure upon 
which new and enhanced services, ranging from distance learning to ticket ordering 
to video-on-demand, will be built and rendered. The easy flow of information will 
integrate business to government, vendors to customers, and all information 
seekers to information providers. Decision making will thus become quicker and 
easier. Whether it is a government authority issuing a license or a housewife 
buying a pair of shoes, all interactions and transactions will be possible on the 
information network and there will be less need to travel. 

8. Conclusion 

In this paper, we have surveyed the history of science communication and have 
clearly stated the new mode of Internet-based science communication. We have 
emphasized the changes that have taken place in the publishing policy and the 
entire publishing business due to the emergence of the World Wide Web. 

In our country there is large scale awareness of the potential of the Internet 
with leaders from all walks of life: politicians, businessmen, technocrats, 
bureaucrats, all talking about it. The fact remains however that the IT infrastructure 
in our country is in poor shape. While 1% of them are able to communicate with 
US information sources, there is very little by way of our own sources of 
information’^'^. Indeed even the science and engineering academics of this country 
do not have their home pages on the Web. Only the institutes of higher learning 
probably have their faculty listing on the Web. It is important that all scientists in 
India form Indian (local) discussion forums, create Web pages for themselves and 
their organisations basically with a view to catch up with the rapidly changing 
science publishing mechanisms. Otherwise Indian science would be left too far 
behind. 

We thus have five recommendations to make:- 

1) Interconnect all S&T establishments and academic institutions with a high 
speed network. 





2) Create the Web infrastructure in all R&D and teaching institutions by 
creating a separate budget. 

3) Create digital libraries or information cafes (kiosks) in all institutions and 
public libraries. 

4) Motivate all R&D and science and technology institutions to place their 
research information on the Web. 

5) Academics to take a leading role in proliferating information exchanges 
among creators and users of science by creating E-Letters (electronic 
newsletters) in the first place and E-Journals later. 

References 

1. Wasserman, Paul. The information communication and transfer process in science and technology 
(1989). 

2. Rajashekar, T.B. and A. Srinivasa Ravi. Electronic databases, networks and information support 
for scientific research. Curr. Sci., 10 February (1994), 66(3), p. 199-212. 

3. Meleis, Hanafay. Toward the information network. IEEE Computer, October, 1996, 
p. 59-67. 

4. Berners-Lee, Tim. WWW: Past, present and future. IEEE Comp., October, 1996, p. 69-77. 

5. Electronic Journal: Why? J. electron. publ., 3(1) 1997, http://www.press.umich.edu/jep/03-01. 

5. Sullivan, Michael. Chemistry on the Web: A new approach to chemical discovery. Today's Chemist 
at Work, 6(6) 1997, 13, p. 15-16. 

6. Global Information Infrastructure Commission (GIIC). Electronic Publishing. 
http://www.gii.org/egi00188.html 

7. IBM intelligent agents home page. 
http://www.networking.ibm.com/iag/iaghome.html 

8. Valauskas, Edward J. First Monday and the evolution of electronic journals. J. electron. publ., 3(1) 
1997, http://www.press.umich.edu/jep/03-01/FirstMonday.html. 

9. Networked Computer Science and Technical Reports Library (NCSTRL). 
http://www.ncstrl.org/ 

10. Library of Congress, http://www.loc.gov/ 

11. Melvyl - University of California Library Catalogue, http://www.melvyl.ucop.edu 

12. IBM patent server, http://www.patents.ibm.com/ibm.html 

13. Jayaram, Anup and others. Is India ready for the net? Business World, 22 August, 1997, p. 28- 
33. 





13. Networked Digital Library of Theses and Dissertations (NDLTD). 
http://www.ndltd.org/ 

14. Digital library information and resources. 
http://interspace.grainger.uiuc.edu/--bgross/digital-libraries.html 

15. D-Lib magazine (electronic journal devoted to digital library developments), 
http://www.dlib.org 

16. ACM Digital Library, http://www.acm.org/dl/ 

17. IEEE Computer Society Digital Library, http://www.computer.org/epub/ 

18. Hebbar, Prashant. Indianet - Netting India. Voice & Data, March 1996. 

18. PointCast, http://www.poincast.com/ 

19. http://www.data.com/Global--Networks/India-Inc.html 

20. Smith, Jean E. and Fred W. Weingarten (Eds). Research challenges for the next generation Internet. 
Workshop convened by Computing Research Association, May 12-14, 1997, Vienna, Virginia, 
http://www.cra.org/main/research_chall.pdf 

21. Subramanyam, L. and Ahmad, Ibrahim. NII: The new agenda. Dataquest, 1-15 January, 1996, p. 
76-86. 





Science and Technology Information 
Resources on the Internet 

T.B. RAJASHEKAR 

National Centre for Science Information (NCSI), 

Indian Institute of Science, Bangalore-560 012 
(E-Mail: raja@ncsi.iisc.ernet.in), (URL: http://www.ncsi.iisc.ernet.in/~raja) 

1. Introduction 

Since the advent of modern science, attempts have been made to improve the 
speed and efficiency of scientific communication. Scholars have evolved various 
formal and non-formal mechanisms, including personal communication networks 
('invisible colleges'), towards this end. Most scholarly information, however, 
has continued to be published in print, i.e., in journals, books, conference proceedings, etc. 
Print-oriented publication has several advantages: media familiarity, usage 
convenience, personal recognition bestowed upon the author and the in-built peer 
review process. It has serious flaws too: publication delays, distribution costs and 
access time-lag. Electronic access to scholarly information was till recently limited 
to secondary information in the form of bibliographic databases, citation indices 
and abstracting journals available online and on CD-ROMs. Libraries around the 
world have also developed varieties of Online Public Access Catalogues (OPACs), 
limited once again to bibliographic details. While identification of primary literature 
has become easier, gaining access to the full text of the required primary publications 
continues to be a major problem. More importantly, these developments have had 
no impact on the publication process itself. 

2. Internet and the World Wide Web 

The emergence of the Internet is radically changing the generation, flow and 
utilisation of scholarly information globally. The Internet has its roots in the ARPANET 
project of the Department of Defence, U.S.A. in the late 1960s. Today the Internet 
interconnects thousands of computer networks and millions of individual 
computers across the world using TCP/IP as the computer communication protocol. 
Starting with basic network services like E-Mail (messaging), FTP (File Transfer 
Protocol) and Telnet (remote login), the Internet has made quick progress with the 
development of tools like Gopher, WAIS (Wide Area Information Server) and the 
World Wide Web (WWW). The WWW, or simply the Web, is the most popular and 
rapidly growing service on the Internet today. 

Substantially revised version of a talk given in the seminar on 'Perspectives in Global Science and 
Technology Communications' held on April 2, 1996 at the Indian National Science Academy, New 
Delhi. 

WWW is an application system implemented on computers connected to the 
Internet, enabling multiple computers with disparate operating systems to 
communicate using Hyper Text Markup Language (HTML) as their lingua franca. 
WWW has made the Internet easier to use. Web servers store information as HTML 
pages and transmit these over the Internet using the Hyper Text Transfer Protocol 
(HTTP), which operates on TCP/IP. Web 
browsers are user interface software packages which receive and display Web 
pages from the servers. Web browsers are available for numerous operating systems 
including Windows, OS/2, MacOS, Unix, etc. Two very popular browsers are 
Netscape and the Internet Explorer. In addition to HTTP, WWW supports other 
access tools like E-Mail, FTP, Gopher and Telnet, making it an all-in-one Internet 
user interface. 
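
The exchange between a browser and a Web server can be illustrated with a minimal sketch in Python (a language chosen here for illustration; it does not appear in the original talk). The page content and the names PageHandler, fetch and demo are assumptions made for this example, and the server runs on the local machine so that no real site is contacted. The client opens a TCP connection, sends a plain-text HTTP request and reads back the status line and the HTML page.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical HTML page held by the example server.
PAGE = b"<html><body><h1>Hello, Web</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch(host, port, path="/"):
    """Send an HTTP/1.0 GET request over a TCP socket; return (status line, body)."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port)) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    return status_line, body


def demo():
    """Start a local server on a free port, fetch the page once, then stop."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status)
    print(body.decode("ascii"))
```

A browser performs the same steps and then renders the returned HTML instead of printing it.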

'Hypertext' is the fundamental organising concept of WWW. It refers to the 
embedding of links within an HTML page (Web page) pointing to other documents. 
The linked documents could comprise text, images, graphs and charts, and audio- 
video clips. These documents need not exist in just one computer. They may be 
distributed among different Web servers located at different geographic locations. 
HTML supports sophisticated information display capabilities and provides wide 
latitude to the information providers to control the page display format. Marking 
up a document using HTML tags is fairly simple. Someone who already knows 
how to use a word processor can learn the basics of HTML within a few hours. 
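
As a small illustration of hypertext, the following Python sketch lists the links embedded in an HTML page using the standard html.parser module. The sample page and its addresses (www.example.org and the relative file name) are invented for this example and do not refer to real resources.

```python
from html.parser import HTMLParser

# Hypothetical Web page with two embedded links.
SAMPLE_PAGE = """
<html>
  <head><title>Preprint list</title></head>
  <body>
    <h1>Recent preprints</h1>
    <p>See the <a href="http://www.example.org/preprints/index.html">archive</a>
       and the <a href="figures/chart.gif">summary chart</a>.</p>
  </body>
</html>
"""


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the link targets of an HTML page in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    for link in extract_links(SAMPLE_PAGE):
        print(link)
```

The first link points to a document on another server and the second to a file on the same server, which is how linked documents come to be distributed across locations.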

The growth in the variety of services and applications developed over the Web 
has been spectacular. Three key Web services that need to be mentioned here are: 
forms-based query processing, VRML (Virtual Reality Modelling Language) and 
Java. WWW supports the search and retrieval of information from databases 
existing on Web servers using HTML-based query forms. VRML has made it 
possible to view and manipulate 2D and 3D graphics. Java is emerging as 
a very important high-level programming language for the WWW. Programs 
written in Java can be automatically transferred to the browser, where they are 
executed by an embedded Java interpreter. 
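
Forms-based query processing can be sketched as follows. When a user fills in an HTML query form, the browser typically encodes the field names and values into the request sent to the server, which decodes them before searching its database. The Python example below shows only this encoding and decoding step; the server address, script path and field names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url, fields):
    """Encode form fields into a GET request URL, as a browser would."""
    return base_url + "?" + urlencode(fields)


def read_query(url):
    """Decode the form fields on the server side (first value of each field)."""
    return {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}


if __name__ == "__main__":
    # Hypothetical search form with two fields.
    url = build_query_url(
        "http://www.example.org/cgi-bin/search",
        {"title": "plastic waste", "year": "1996"},
    )
    print(url)
    print(read_query(url))
```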

3. Network Information Revolution 

The relative ease with which Web-related tools may be used to publish and access 
multimedia information over the Internet has led to the availability of a variety 
of digital information sources on the Internet. 'Network information' is the term 
often used to represent this emerging new information world. Hypertext-linked 
multimedia documents on widely distributed Web servers are literally enveloping 
the earth with information. 


The emerging digital network information world is depicted in Fig. 1. It shows 
desk-top computers (PCs) connected to the Internet via institutional local area 
networks and network service providers (e.g. Internet service providers like ERNET 
and the Videsh Sanchar Nigam Limited). A variety of digital information will be 
mounted on Web server computers. Using Web browser programs like Netscape 
and Internet Explorer on their desk-tops, users will be able to seamlessly connect 
to and extract information from local as well as geographically dispersed Web 
servers. 

4. Science and Technology Information Resources on the Internet 

Most of the initial research and development leading to the emergence of the 
Internet was done at universities and research centres. In fact WWW itself was the 
result of efforts made by theoretical physicists at CERN, Switzerland, to improve 
the usability of their publications by exploiting the hypertext linking concept. 
E-Mail, which is used extensively on the Internet, has dramatically improved personal 
communication and collaborative research. E-Mail-based discussion forums have 
enabled faster group communication across national boundaries. Development of 
E-Mail based preprints registration and routing systems in the area of high energy 
physics and mathematics in the early days of the Internet successfully demonstrated 
the possibility of developing elegant, effective and inexpensive solutions to 
problems inherent to print publications. Anonymous FTP services and Telnet have 
been used to provide access to research papers, reports and scientific databases. 
With the availability of the WWW technology, there has been phenomenal growth 
in the number of Web sites providing access to a variety of S&T resources. Publishers 
like Elsevier, science societies like the American Mathematical Society and 
professional organisations like the IEEE and IEE have set up their own Web sites 
to deliver a variety of scientific information, including journals. 

Some of the key S&T resources available today on the Internet include the 
following: 

- Preprints 

- Discussion forums 

- Electronic journals 

- Tables of Contents (journals) 

- Technical Reports 

- Library catalogues 

- Campus Wide Information Services (CWIS) 

- Scientific data sets 





- Patents and Standards 

- Directories of Science and Engineering institutions, associations and societies 

- Reference sources 

- Courseware and distance education 

- S&T resource catalogues (virtual libraries) 

In the following subsections we briefly discuss some of these resources. An 
illustrative list of several specific Web-accessible S&T resources, along with their 
site address, is given in Appendix-I. 

4.1 Electronic Journals 

Popularly known as e-journals, these represent a major growth area on the 
Internet. Major publishers of science journals like Elsevier¹, Academic Press² and 
Blackwell Scientific³ have announced their commitment to provide Web access to 
their journals. There are two types of e-journals: electronic versions of print 
journals, and serials published and accessed on networks with no print counterpart. 
Some e-journals have only text content, but the trend is towards Web access to 
both text and images, including 2D and 3D graphics using VRML. 

Electronic journals offer several benefits. Users gain quick access to current 
and archival issues. Availability is very high compared to print issues. Users also 
have the choice of downloading only the desired articles. Keyword-based search 
facilitates fast and easy identification of articles. The hypertext feature used by 
many e-journals helps readers trace a reference quickly and gain immediate access 
to the full article. They can be produced faster and more economically than printed 
journals. Since they need not be bound by physical size or number of articles, 
backlogs and delays are naturally eliminated. Papers may be published as soon as 
they are accepted by the referees. 

Quick access to e-journals, however, requires good network speeds. If users 
like print copies for extended reading, they need easy access to printers with good 
graphics capabilities. More importantly, licensing and authorization restrictions 
may prove a major hindrance to casual users/browsers. 

4.2 Tables of Contents of Journals 

Major publishers of science journals today deliver content pages of their 
journals by E-Mail, mostly free of cost. The 'Contents Direct' service by Elsevier⁴, for 
example, delivers content pages of over 800 journals. Free Web access to content 
pages is also provided by most publishers today. Generally, the content page data 
are available much earlier than the full journal. Many document delivery agencies 
also provide access to content pages. A very popular service is the Uncover service 
provided by the CARL agency, providing free access to content pages of over 
16,000 journals. The Institute for Scientific Information (ISI), U.S.A., has recently initiated 
a content page alerting service⁵. 

4.3 Preprints 

Preprints were used by scholars as a means to enhance the speed and efficiency 
of scientific communication. Preprints were among the earliest to be delivered 
electronically over the Internet, first by E-Mail and later over the World Wide Web. 
Internet-based preprint services have been highly successful in High Energy Physics 
and Mathematics. For example, the American Mathematical Society preprint server 
provides Web access to the 100 most recent preprints submitted to the server, offers 
field-based search of the preprint archives, and facilitates forms-based submission of 
new preprints. 

4.4 Discussion Forums and Usenet News 

Discussion forums, also called mailing lists, discussion lists and listservers, 
are a major network resource that serves the purpose of current awareness. They 
use E-Mail to set up informal discussion among people of specific research interests 
via the Internet. Forum software (e.g. Listserv) maintains a list of E-Mail addresses 
of all subscribers/members, and the messages posted to the forum are distributed 
automatically to all subscribers. Joining (subscribing) and leaving (signing off) the 
forum and posting messages are carried out through E-Mail. Participating in 
discussion forums does not require dedicated Internet connectivity, which can be 
very expensive. Most forums archive the messages and allow searching and 
extraction of earlier discussions. Forums may be moderated or un-moderated. On 
moderated forums, the messages posted to the forum for distribution are screened 
by a human expert before they are distributed. 
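
The working of a forum described above can be summarised in a short Python sketch. It is a toy model written for this illustration, not the actual Listserv software: the class DiscussionForum, its method names and the E-Mail addresses are all hypothetical, and delivery is represented by a returned list rather than by real mail.

```python
class DiscussionForum:
    """Toy model of an E-Mail discussion forum (not the Listserv software itself)."""

    def __init__(self, name, moderated=False):
        self.name = name
        self.moderated = moderated
        self.subscribers = []  # E-Mail addresses of members
        self.pending = []      # messages waiting for the moderator
        self.archive = []      # every message that was distributed

    def subscribe(self, address):
        if address not in self.subscribers:
            self.subscribers.append(address)

    def sign_off(self, address):
        if address in self.subscribers:
            self.subscribers.remove(address)

    def post(self, sender, text):
        """Distribute at once, or hold the message if the forum is moderated."""
        message = (sender, text)
        if self.moderated:
            self.pending.append(message)
            return []
        return self._distribute(message)

    def approve_next(self):
        """Moderator releases the oldest pending message, if there is one."""
        if not self.pending:
            return []
        return self._distribute(self.pending.pop(0))

    def _distribute(self, message):
        self.archive.append(message)
        return [(address, message) for address in self.subscribers]

    def search_archive(self, keyword):
        """Return archived messages whose text contains the keyword."""
        keyword = keyword.lower()
        return [m for m in self.archive if keyword in m[1].lower()]


if __name__ == "__main__":
    forum = DiscussionForum("demo-forum", moderated=True)
    forum.subscribe("a@example.org")
    forum.subscribe("b@example.org")
    forum.post("a@example.org", "New preprint on fusion")
    for address, (sender, text) in forum.approve_next():
        print(f"deliver to {address}: {text} (from {sender})")
```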

Participation in discussion forums has several advantages. Forums are not 
bound by geographic distances. They help the participant to keep up-to-date with 
current developments in a field, which are often not reported fast enough in print 
media. One can pose professional questions and seek solutions and participate in 
discussions on specific topics in a short period of time. Forums are a great place 
to identify peer workers. It may also be mentioned here that a good number of 
messages received from forums may be irrelevant. Users also may have to contend 
with the advertising, canvassing and commercial uses of discussion forums. 

Unlike discussion forums, Usenet News is not delivered to the user's E-Mail 
box. Instead, News reading software is used to access News from the nearest 
Usenet News feed computer. News (and E-Mail) reading facility is now supported 
by Web browsers like Netscape. Usenet News groups are hierarchically structured; 
e.g., sci.physics.fusion is the news group for discussions related to nuclear 
fusion. Since the number of messages posted every day to these groups is very 
large, setting up News feed sites is quite expensive, requiring high-bandwidth 
connectivity and powerful News server computers with large disk storage. For 
this reason we do not have many News sites in India. However, archives of most 
of the News groups can be searched using Web search tools like Altavista (discussed 
in more detail below). 

4.5 Technical Reports 

Technical reports provide more details of ongoing or completed R&D projects 
than can be obtained from papers published in journals and conference proceedings. 
Individual departments in research institutes and universities are good sources of 
technical reports. Thanks to the Internet, a large number of these reports can be 
easily accessed, often free of cost. Web servers of universities and research institutes 
often point to such reports and publications. For example, a well-developed 
system called NCSTRL (Networked Computer Science Technical Reports Library) 
now exists for world wide access to technical reports in Computer Science research. 

4.6 Library Catalogues 

A large number of library catalogues can be accessed online via the Internet. 
Generally available free of cost, these are useful for finding books not available 
locally, identifying and selecting books for local acquisition, bibliographic 
verification, and for searching holdings of periodicals. Most do not need any login, 
some accept 'guest' logins and a few require authorisation. Access modes include 
Telnet, Gopher and the WWW. While there is a large variation in the Telnet-based 
search interfaces across different OPACs, their on-going transition to the Web will 
ensure a common search interface in the near future. Catalogues of very large 
libraries like the University of California ('Melvyl') and the Library of Congress 
('LOCIS') are examples of Internet-accessible library catalogues. 

4.7 Campus Wide Information Services (CWIS) 

These are online information services of universities providing Web-based 
access to a variety of information. In addition to research literature, these provide 
access to faculty and student directories, course details and research projects, 
campus computing, library catalogues and other databases, admission regulations 
and policies, placement information, campus phone directories, etc. 

5. Finding and Keeping Up-to-Date with Internet Resources 

Looking for a specific piece of information on the Internet is quite like searching 
for a needle in a haystack. With thousands of sites out there, how do we know 
which site contains the information we are looking for? Search tools, resource 
directories and current awareness services offer some solution. A useful list of 
such resources is given in Appendix-II. 

Search tools use spider programs (robots) which periodically visit Web sites 
around the world, gather the Web pages, index these pages and build a database 
of information given in these pages. They provide a forms-based search interface 
for the user to enter a query consisting of one or more keywords and their 
combinations (e.g. 'plastic', 'plastic waste' or 'plastic waste and recycling'). The search 
results are returned in the form of a list of Web sites matching the query. Using 
embedded hypertext links one can then connect to each of these sites. There are 
very powerful Web search tools today with varying indexing and search capabilities. 
Alta Vista⁶, developed by the Digital Equipment Corporation, is perhaps the most 
powerful search service, allowing full-text searching of over 30 million Web pages 
on over 2,75,000 Web servers. Its search capabilities include truncation, Boolean, 
proximity, parentheses and specific field (title, URL, host, links) searching. There 
are also search services which send queries to several individual search systems. 
Metacrawler⁷, for example, sends queries to nine individual services simultaneously: 
Alta Vista, Excite, Galaxy, InfoSeek, Inktomi, Lycos, Open Text, WebCrawler and 
Yahoo. Results are organised into a uniform format with duplications removed. 
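
The indexing and searching steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method used by Alta Vista or Metacrawler: the pages are a small in-memory sample with invented addresses instead of pages gathered by a spider, a query matches only pages containing all of its keywords, and the function names are chosen for this example.

```python
import re
from collections import defaultdict

# Hypothetical pages, standing in for what a spider program would gather.
PAGES = {
    "http://www.example.org/a.html": "Plastic waste and its recycling in cities",
    "http://www.example.org/b.html": "Recycling of paper waste",
    "http://www.example.org/c.html": "Properties of plastic films",
}


def tokenize(text):
    """Split text into lower-case keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: keyword -> set of page addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search_all(index, query):
    """Return pages containing every keyword of the query (Boolean AND)."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


def merge_results(*result_lists):
    """Combine result lists from several services, dropping duplicates."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


if __name__ == "__main__":
    index = build_index(PAGES)
    print(search_all(index, "plastic"))
    print(search_all(index, "plastic waste recycling"))
    print(merge_results(search_all(index, "plastic"), search_all(index, "waste")))
```

The last function mirrors, in miniature, what a multi-service search tool does when it removes duplications from the combined results.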





Internet resource directories, which are also called 'meta sources', 'virtual 
libraries' and 'resource guides', catalogue Internet resources and provide hypertext 
links to these sites. They are very useful for resource identification and navigation. 
These catalogues categorise resources by subject and/or resource type. Broadly, 
there are two types of resource directories: omnibus and subject/resource specific. 
Omnibus directories attempt to cover several areas. Examples include Yahoo, 
EINET Galaxy, WWW Virtual Library and Planet Earth. Subject or resource-specific 
directories are usually maintained by science and engineering societies and 
organisations (e.g., American Mathematical Society and American Chemical 
Society), and by departments or libraries in universities and research institutes. 

A few resources on the Internet serve the purpose of current awareness by 
reporting new sites. These include mailing lists (e.g. Newjour, which announces 
new journals available on the Internet) and newsletters (e.g. Scout Report, a weekly 
publication describing new resources). Most of these are available freely and can 
be obtained using E-Mail. 

6. Trends in Internet-based Scholarly Information 

With intranets (use of Internet technology within institutional and corporate 
networks) catching up with university campuses, we are beginning to see Web 
sites being set up at department levels. Academics have taken actively to publishing 
on these sites and experimenting with new technologies (e.g. special mark up 
language for 3D manipulation of chemical structures). Several digital library 
projects⁹ have been launched around the world. Publishers, universities and research 
institutes and laboratories are co-operating in many of these projects to create 
large digital collections. Elsevier, in association with seven U.S. universities, has 
recently concluded TULIP (The University Licensing Project), which aimed at 
providing desktop access to primary journals on campus networks. An Electronic 
Thesis and Dissertations (ETD) Project¹¹ has recently been launched in the U.S., 
with participation by universities around the world. Secondary database publishers 
like the Institute for Scientific Information (ISI)¹² and OCLC¹³ are expanding their 
roles to electronic delivery of primary publications. 

Traditional online database vendors like Knight Ridder¹⁴ and STN 
International are now providing Web interfaces to their databases. Silver Platter, a 
major CD-ROM database publisher, now provides a WWW gateway to CD-ROM 
databases mounted on Unix servers using its ERL (Electronic Reference Library) 
software. Cambridge Scientific Abstracts, another major publisher of science 
databases, was among the first to provide Web access to its databases. ISI has 
recently announced an Intranet solution to the Science Citation Index¹⁵. 

Internet-based information access is not without its problems. One problem 
has to do with the uncertainty about the quality of the information and durability 
of the Internet sites. Given the ease and speed with which information can be 
published on the Internet, the quality of many free sites normally becomes suspect. 
Thankfully, a solution to this is emerging in the form of rating services like Point 
and Magellan (see Appendix-II). Directories that evaluate resources included in 
their catalogues also serve a quality control function. Many high quality free 
Internet sites, particularly those maintained by individuals, disappear or shift 
their location, depending on the interests and movement of the individual. Another 
serious problem is the poor response time. With the increasing use of Internet for 
business and commerce and delivery of multimedia files, network pipes are 
getting choked, causing large delays in information transmission. There are moves 
to create Internet II, a very high speed network for the exclusive use of the 
scientific community¹⁶. 

Another problem relates to the copyright of network resources. Many believe 
that the publishers, who have control over most of the high quality print information 
resources (e.g. journals), would not only like to retain the control in the network 
environment, but perhaps exploit it further to their advantage using electronic 
technology to restrict access to information¹⁷. It is believed that the publishing 
lobby is behind the recent move to get a new intellectual property treaty ratified 
by the WIPO, further restricting the 'fair use' of copyrighted and public domain 
material for research and education purposes¹⁸. 

7. Internet and Research : The Indian Situation 

Education and Research Network (ERNET)¹⁹ has been quite successful in creating 
awareness of the Internet among the higher education and research community 
in the country. So far, the use of the Internet has been limited to E-Mail exchange and 
to accessing external information using tools like FTP, Telnet and the WWW. Only 
a few universities and educational institutions (e.g. IITs, IISc, Punjabi Univ.), 
research institutes (e.g. IUCAA, National Chemical Laboratory) and government 
S&T departments (e.g. Dept. of Electronics) have their own Web sites. The reasons 
are many. There are only three Internet service providers in the country today - 
ERNET, VSNL²⁰ and NIC²¹. The cost of setting up a Web site with a reasonable 
bandwidth of 64 kbps is very high. Besides, the telecommunication tariff in India 
is among the highest in the world. Such high costs discourage developments 
related to the setting up of Campus Wide Information Services, Web-accessible 
databases (e.g. library catalogues, theses), publication and resource directories and 
discussion forums. 

The National Centre for Science Information (NCSI)²² at the Indian Institute 
of Science, Bangalore has made a modest beginning in this direction by setting up 
a structured catalogue of key Internet sites in Science and Engineering and providing 
access to a few databases, including a union catalogue of journals held in the five 
IITs and IISc. The Centre was among the first to set up and operate a discussion 
forum²³ (LIS-FORUM) for providers and users of library and information services 
in the country. 

8. Conclusion 

This paper presents only an indicative overview of the developments taking place 
in the generation and dissemination of science and technology information on the 
Internet. Many believe that Internet is a new paradigm in global communication 
and information flow, equal in significance to those associated with the invention 
of paper, the making of the printing press, and the emergence of the computer. 
Internet is collapsing national boundaries, albeit in a virtual environment, and 
bringing together the scientific community in a way that has begun to 
fundamentally alter the way advanced research and higher education are carried 
out. 

Acknowledgements 

My sincere thanks to Professor Thomas Chacko, Foreign Language Section, Indian 
Institute of Science, Bangalore for his help during the preparation of this paper. 

References 

1. Elsevier : http://www.elsevier.nl/ 

2. Academic Press journals : http://www.idealibrary.com/ 





3. Blackwell Scientific journals : http://www.blacksci.co.uk/uk/journals.htm 

4. Contents Direct service from Elsevier : http://www.elsevier.nl/ 

5. ISI's table-of-contents alerting service (Journal Tracker) : http://www.isinet.com/jtrack 

6. Altavista : http://altavista.digital.com/ 

7. Metacrawler : http://metacrawler.cs.washington.edu:8080 

8. Chemical markup language for 2D and 3D chemical structure handling on the Web : http://www.venus.co.uk/omf/ 

9. Digital library projects : http://sunsite.berkeley.edu/ 

10. The University Licensing Project (TULIP) of Elsevier : http://www.elsevier.nl/ 

11. Electronic Thesis and dissertations project: http://etd.vt.edu/etd/ 

12. ISI-IBM electronic library project: http://www.isinet.com/ 

13. Electronic journals from OCLC : http://www.oclc.org/menu/ejo.htm 

14. Science Web from KnightRidder : http://www.krinfo.com/ 

15. Intranet access to citation index databases from ISI : http://www.isinet.com/ 

16. Internet II : http://www.internet2.edu/ and Very high speed Backbone Network for Internet 
II : http://www.vbns.eu/ 

17. Ken Rouse. The serials crisis in the age of electronic access. Newsletter on Serials Pricing, No. 
177, May 1 1997 (http://sunsite.unc.edu/reference/prices/1997/PRIC177.HTML). (To subscribe 
to this newsletter, send the E-Mail message "subscribe prices your name" to listproc@unc.edu). 

18. James Love. A primer on the proposed WIPO treaty on database extraction rights that will be 
considered in December 1996 (http://www.essential.org/cpt/ip/cpt-dbcom.html) 

19. ERNET : http://www.ece.iisc.ernet.in/ 

20. VSNL : http://www.vsnl.net.in/ 

21. NIC : http://www.nic.in/ 

22. NCSI : http://www.ncsi.iisc.ernet.in/ 

23. LIS-FORUM. To subscribe to this forum send the E-Mail message "subscribe lis-forum your 
name" to listserv@ncsi.iisc.ernet.in 





Appendix I 

S&T Information Resources on the Internet: Some Examples 

These are examples of specific S&T resources. Resource guides and finding aids 
are listed in Appendix-II. 


Information Source Type 
    Examples 

Preprints 

    American Mathematical Society preprint server 
    http://www.ams.org/preprints/ 

    LANL preprint server 
    http://xxx.lanl.gov 
    Covers 11 areas including high energy physics, nonlinear sciences, 
    computation and language, etc. Mirrored in about 10 countries. 

Technical Reports 

    NASA Technical Reports 
    http://techreports.larc.nasa.gov 

    STAR Scientific and Aerospace Reports File 
    http://www.sti.nasa.gov/rselect/star.html 

    Networked Computer Science Technical Reports Library (NCSTRL) 
    http://www.ncstrl.org/ 
    International collection of computer science technical reports from 
    computer science departments and industrial and Govt. research labs. 

Theses and Dissertations 

    Electronic thesis and dissertation project (ETD), Virginia Tech 
    university, USA 
    http://etd.vt.edu/etd/ 

    UMI's Dissertations Explorer 
    http://www.umi.com/hp/support/dexplorer 
    The most recent three months' dissertations are available free. 

Discussion Forums 

    GENTALK 
    Forum for discussion of genetic problems, lab protocols, and current 
    issues in genetics and genetic engineering in general. 
    Subscription to: 'listserv@usa.net' 

    HUM-MOLG 
    Discussions, non-commercial ads, announcements and questions related 
    to the field of human molecular genetics. 
    Subscription to: 'listserv@nic.surfnet.nl' 

Content Pages of Periodicals 

    Uncover 
    http://uncweb.carl.org 
    Uncover provides free access to content pages of over 16,000 journals. 
    Supports a forms-based search interface. 

    ESTOC - Elsevier Science Tables of Contents Service 
    http://www.elsevier.nl/estoc/ 
    Provides access to the tables of contents of approximately 900 Elsevier 
    Science primary and review journals. This service is also available by 
    E-Mail to registered users. 

Science Journals 

    Physics Express Letters 
    http://www.iop.org/PEL 
    Free access to abstracts and full text of 12 Institute of Physics 
    Publishing journals. 

    Science 
    http://www.sciencemag.org/ 


    Weekly journal from the American Association for the Advancement of 
    Science. 

    National Geographic 
    http://www.nationalgeographic.com/ 

Conferences and meetings 

    World Wide Web Virtual Library on Conferences 
    http://conferences.rpd.net/ 

Library Catalogues 

    Melvyl - University of California Library Catalogue 
    http://melvyl.ucop.edu/ 

    Library of Congress 
    http://lcweb.loc.gov/catalog/ 
    Book catalogue from 1898 to date. 

Patents 

    USPTO and AIDS patents 
    http://patents.uspto.gov/ 
    Provides free access to bibliographic data of US patents issued since 
    1/1/76 and the full text of AIDS related patents issued in US, Japan 
    and Europe. 

    IBM Patent Server 
    http://patent.womplex.ibm.com/ 
    Provides access to over 26 years of U.S. Patent & Trademark Office 
    (USPTO) patent descriptions and the last seventeen years of images. 

Reference Sources 

    The Merck Manual 
    http://www.merck.com/ 
    The Merck Manual is one of the most widely used medical texts in the 
    world. Written by over 300 experts, it covers all but the most obscure 
    disorders. 


    Encyclopaedia Britannica Online 
    http://www.eb.com/ 

    Nobel Prizes 
    http://www.nobel.se 

Science News Services 

    EurekAlert! 
    http://www.eurekalert.org/ 
    This is a comprehensive news service for up-to-date research in 
    science, medicine, and engineering. 

Databases 

    Molecular Biology (gene, enzyme and protein data banks) 
    http://www.unl.edu/stc-95/ResTools/cmshp.html 

Internet-based Education 

    The Globewide Network Academy 
    http://www.gnacademy.org/ 
    Offers distance education with over ten thousand courses and degree 
    programs. 

Science Books 

    National Academy Press (NAP) 
    http://www.nap.edu/readingroom/ 





Appendix II 


Information Resource Finding Aids 


Resource Guide Type 
    Examples 

Search Tools 
Use spider programs to gather millions of Internet resources and 
build large searchable indexes. Good for locating specific sources. 

    Altavista 
    http://altavista.digital.com 
    Indexes over 30 million Web pages. Includes Usenet newsgroup articles. 

    Open Text Index 
    http://index.opentext.net 

Subject directories + Search tools 
Provide both browsable subject directories and large indexes of 
Internet sources gathered by spider programs. 

    Lycos 
    http://www.lycos.com 

    Excite 
    http://www.excite.com 

    InfoSeek 
    http://guide.infoseek.com 

Subject directories (multi-subject) 
Offer browsable hierarchical subject arrangement of Internet 
sources. Place to start if you do not have specific resources in 
mind. Many support keyword searching of the resources they cover. 

    Argus/Univ. of Michigan Clearinghouse 
    http://www.clearinghouse.net 
    A collection of topical guides which identify, describe and evaluate 
    Internet-based resources. 

    World Wide Web Virtual Libraries 
    http://www.w3.org/hypertext/DataSources/bySubject/Overview.html 
    A distributed subject catalogue of Internet sites. 

    Yahoo 
    http://www.yahoo.com 
    A comprehensive subject directory of over 80,000 Internet resources. 

    Einet Galaxy 
    http://galaxy.einet.net/ 

    Planet Earth 
    http://www.nosc.mil/planet_earth/ 

    OCLC NetFirst (fee-based) 
    http://www.oclc/netfirst.htm 
    Catalogues selected sites. Bibliographic information plus abstracts. 
    Subject headings and LC and DDC classification numbers. 

Rating Services 

    Point 
    http://point.lycos.com/categories/index.html 
    Provides ratings and descriptions of the top 5% of Web resources. 

    Magellan 
    http://www.mckinley.com 
    Provides ratings and detailed descriptions of the resources in its 
    subject directory. 

Subject/Resource Type Specific Guides 

Organisations 

    Scholarly societies, academies and federations 
    http://www.lib.uwaterloo.ca/society/overview.html 

    Colleges and Universities home pages 
    http://www.mit.edu:8001/people/cdemello/univ.html 


Discussion forums/lists 

    Diane Kovacs's list of scholarly electronic conferences 
    http://www.mid.net/KOVACS/ 

    Liszt 
    http://www.liszt.com/ 
    Directory of E-Mail discussion groups. Covers over 54,000 listserv, 
    listproc, majordomo and independently managed lists from over 1800 
    sites. 

Electronic journals 

    ARL directory of electronic journals and newsletters 
    gopher://arl.cni.org:70/11/scomm/edir/ 
    Directory containing scholarly serials and electronic newsletter 
    titles. 

Courseware 

    The World Lecture Hall 
    http://www.utexas.edu/world/lecture/ 
    Contains links to pages created by faculty worldwide who are using 
    the Web to deliver class materials. For example, you will find course 
    syllabi, assignments, lecture notes, exams, class calendars, 
    multimedia textbooks, etc. 

Physics 

    TIPTOP - The Internet Pilot to Physics 
    http://www.tp.umu.se/TIPTOP/ 

Biology 

    Biology resource guide at Harvard 
    http://golgi.harvard.edu/ 

    National Biological Information Infrastructure 
    http://www.nbs.gov/nbii/ 

Chemistry 

    ChemCenter of the American Chemical Society 
    http://www.ChemCenter.org/ 

Engineering 

    WWW Virtual Library - Engineering 
    http://arioch.gsfc.nasa.gov/wwwvl/engineering.html 


Mathematics 

    Mathematics on the Internet 
    http://e-math.ams.org/ 

Medicine 

    WWW Virtual Library - Medicine 
    http://www.ohsu.edu/cliniweb/wwwvl/ 

Mailing lists and newsletters 
For staying up to date on new Internet resources. 

    Scout Report 
    http://rs.internic.net/scout/report 
    A weekly publication offering a selection of new and newly discovered 
    resources of interest to researchers and educators. 

    SENN - Scientific and Engineering Network News 
    http://www.senn.com/ 
    A fee-based, monthly guide to Internet resources for scientists and 
    engineers. 

    Internet Resources Newsletter 
    http://www.hw.ac.uk/libWWW/irn/irn.html 
    Monthly publication with a scope similar to that of the Scout Report. 

    Info Watch 
    http://www.ncsi.iisc.ernet.in/ncsi/infowatch.html 
    Monthly publication reporting selected new resources on the Internet. 

    New Jour 
    Subscription address : mjd@ccat.sas.upenn.edu 
    Announces new journals available on the Internet. 

    Best Web 
    Subscription address : listserv@trcearnpc.ege.edu.tr 
    Discussion of the best Web sites. 





Research in the E-mail and Internet Era 
An Example from the Life Sciences 

RAGHAVENDRA GADAGKAR 
Centre for Ecological Sciences, Indian Institute of Science, 
Bangalore 560012 
and 
Animal Behaviour Unit, Jawaharlal Nehru Centre for Advanced Scientific 
Research, Jakkur, Bangalore 560064 
E-MAIL: ragh@ces.iisc.ernet.in 

In what follows, an informal personal account is given of what the e-mail and 
internet era is doing to the way we conduct our research. It is now several years 
since we, at the Indian Institute of Science, have had access to efficient electronic 
mail and internet connectivity. This, coupled with powerful software such as 
Netscape and Microsoft Internet Explorer, has indeed brought about a rather 
sudden qualitative change in the strategies that one has to use to deal with routine 
day-to-day problems encountered by any researcher. By far the most important 
change that the internet era has brought about is that we now have, as never 
before, rapid, essentially unlimited access to information. And this applies to all 
kinds of information, be it simple facts about any scientific topic (such as the 
number of segments in the antenna of a honey bee for example!), the correct 
citation of a paper, the initials or address of a colleague, bibliography on any 
subject, information about scientific meetings and even opinion of a wide cross- 
section of the scientific community on any particular topic. Much of this information 
can be got by surfing the internet and if that fails, one can always send e-mails to 
electronic networks or discussion groups and get people to actively respond to 
your queries - and there will be plenty who will be willing to do so. 

Text of talk delivered during the INSA Seminar on "Perspectives in Global Science and Technology 
Communication", held on 2nd April 1996 at New Delhi. 

Let me take a specific example from my recent experience. As many of you 
must have experienced, our President, Dr. S. Varadarajan, keeps sending us 
newspaper clippings from The Times, London, or other such not readily 
accessible sources. I have received my fair share of such, often intriguing, clippings 
and wish to take this opportunity to thank him for his generosity. Not long ago 
he sent me a clipping from The Times, London, dated 27th January 1996. It said 
"Home News, Page 3 - Scientists create tiny antenna to keep eye on bees, by 
Michael Hornsby, countryside correspondent" and went on to say: 

"BRITISH scientists have invented the world's smallest radar antenna to track bees and 
other low-flying insects. The device could improve the efficiency of bee-keeping and help 
to combat the tsetse fly, carrier of sleeping sickness in central and southern Africa. The 
antenna, which weighs three milligrams and is 16 millimetres high, is glued to the back of 
the bees. Field trials show the creatures can fly normally with the extra load but have some 
difficulty entering their hives." 

This was fascinating and it even had a picture of a bee with a radar antenna 
glued to its back but, like all newspaper stories, it said nothing about where I can 
find out more about this Joe Riley and his work which is so relevant to what I am 
doing with wasps in Bangalore. In the past, I have often been frustrated by the 
great difficulty in following up a story from the popular press. All one could do 
was to ask around casually and it may be years before you have a chance to have 
coffee with someone who might know more. But now we live in the e-mail and 
internet era. I tried to search on my own on the internet and having failed to come 
up with anything, I sent off the following e-mail to the Social Insect Electronic 
Network of which I am a subscriber. 

"Dear Friends, 

I recently read the following report in The Times, London and thought it might 
interest some of you. I would like to contact the scientists involved in this work and read 
the original papers/reports. Does any one have any ideas on how to find the address of the 
people involved? 

I would appreciate any help in this matter. Thanks, 

Raghavendra Gadagkar" 

The next day, I was flooded with responses. One of them said: 

"Dear Prof.Gadagkar, 

You were inquiring on SOCINSECT-L about tracking foraging bees with harmonic 
radar. The work was carried out at Rothamsted Experimental Station, by Dr Juliet 
Osborne, Prof Ingrid Williams and Colleagues, in collaboration with the NRI at Malvern. 
Juliet's e-mail is: Juliet.osborne@bbsrc.ac.uk 





Her postal address: 

Entomology and Nematology 
Rothamsted Experimental Station 
Harpenden 
Herts, UK. AL5 2JQ 

Yours Sincerely 
Catherine Williams 
Dept of Zoology 
Downing Street 
Cambridge CB2 3EJ" 

Another response, sent in fact by a friend of mine at the Michigan State 
University, said: 

"Raghavendra Gadagkar asked for more information about the use of harmonic radar 
to track bees. See Nature 4 Jan 96 (vol 379, pp.29-30). This is the publication the news 
story picked up. It is short (scientific correspondence) article, but it shows pretty nice 
pictures of what this tracking system can do. The range, by the way, is less than 500m. 

Fred" 

Notice that the second one was addressed to every one on the social insect 
subscriber list so that the answer to my question was available to all those who 
had read my question. I received many more responses, made sure to thank all the 
respondents, and went over to the library to get the relevant copy of Nature, less 
than 24 hours after I had received the newspaper clipping from Dr. Varadarajan. 

So much goes on in these electronic discussion groups that there is great scope 
to stay in touch with the latest information and opinion, often without even 
having to go to the library. Although these debates can be very stimulating and 
informative, I worry that well-researched text books and peer-reviewed research 
papers will begin to take a back seat and what is said by somebody on an 
electronic discussion group will begin to sway opinion. Again, let me take an 
example from my recent experience. 

First let me give some background information. In 1946, the Austrian zoologist 
Karl von Frisch first reported that honey bees use a dance language to communicate 
to their sister bees information about the direction and distance from the hive of 
a new source of food they may have found. Many experiments, mainly by von 
Frisch and his students, have since confirmed the existence of the honey bee dance 
language. Honey bees appear to have the ability to measure the angle between the 
azimuth of the sun, the hive, and the source of food, remember this angle, return 
to their nest and perform a so-called waggle dance on the vertical surface of the 
nest. When the waggle run is pointed down, the food is in exactly the opposite 
direction from the sun, and if the waggle run is 80 degrees to the left of vertical, then the food 
is 80 degrees to the left of the sun. By knowing the orientation and the duration 
of the waggle run made by the dancing forager, a bee attending the dance can 
know the distance and direction of the food and can, indeed, go and find it 
without any further help from the dancer. This spectacular discovery fetched Karl 
von Frisch the Nobel Prize in 1973 (even though the prize is somewhat naively 
meant only for Physiology or Medicine!). 
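
The direction rule described above can be written down as a small calculation. The following is only an illustrative sketch: the function name, the sign convention (angles to the left of vertical are negative) and the sun azimuth of 180 degrees are assumptions made for the example, not part of von Frisch's work.

```python
def food_bearing(sun_azimuth_deg, waggle_angle_deg):
    """Return the compass bearing of the food source, in degrees.

    waggle_angle_deg is the angle of the waggle run measured from
    straight up on the vertical comb: negative to the left, positive
    to the right, and 180 for a run pointing straight down.
    (This sign convention is an assumption made for the sketch.)
    """
    return (sun_azimuth_deg + waggle_angle_deg) % 360


# Assume the sun is due south (azimuth 180 degrees).
print(food_bearing(180, 0))     # run straight up: towards the sun -> 180
print(food_bearing(180, 180))   # run straight down: away from the sun -> 0
print(food_bearing(180, -80))   # run 80 degrees left of vertical -> 100
```

The distance component of the dance is left out here because converting the duration of the waggle run into metres needs a calibration that the text does not give.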

Starting from 1967, two researchers from California, Adrian Wenner and 
Patrick Wells have challenged von Frisch's dance-language hypothesis and claimed 
that bees find sources of food based exclusively on odour left by the original bee 
and that even if bees perform a dance, this is never really used by the recruits in 
any way. This challenge has led to many more rigorous experiments and von 
Frisch's dance-language hypothesis has been, I think, rather convincingly 
vindicated. But Wenner and Wells remain unconvinced. 

The whole issue of the dance-language controversy was raked up by someone 
on the social insect network recently. Many statements for and against the dance-language 
hypothesis, and for and against the Wenner-Wells position, were exchanged. 
Things came to a head when Adrian Wenner sent the following provocative mail 
to the list: 

"From: Adrian Wenner «wenner@LIFESCI.LSCF.UCSB.EDU» 

Despite popular opinion, objectivity simply doesn't seem possible. 

"However broad-minded one may be, he is always to some extent the slave of his 
education and of his past." Emile Duclaux (a biographer of Pasteur) 

"It is a common failing... to fall in love with a hypothesis and to be unwilling to take 
No for an answer." Peter Medawar. 

Evidence usually matters little during the heat of a controversy. Proponents of a 
hypothesis embrace one set of evidence. Opponents focus on other evidence. 





Some of the "hard sciences" have a tradition in which supportive evidence does not 
matter much if a hypothesis fails a real test (witness "cold fusion" research). 

A honey bee dance language hypothesis no longer seems to exist." 

Since there was a great deal of confusion with angry mails going back and 
forth, Fred Dyer, a prominent bee dance researcher, decided to take a vote and sent 
the following mail: 

"Please send me a brief e-mail stating whether you AGREE or DISAGREE with this 
quoted statement. I will tabulate the results and report them to the group in 1 week. 

Main Question: 

"Honey bee recruits can use the directional and distance information contained in the 
dances of foragers to bias their search for food." 

Optional (if you respond to either of the following statements, please respond to both): 

1. "I have read 'The Dance Language and Orientation of Bees', by von Frisch (1967), 
one of Gould's review articles from the mid-70s, or other papers reporting evidence 
for the dance language hypothesis". Answer YES or NO. 

2. "I have read 'Anatomy of a Controversy' by Wenner and Wells". Answer YES or 
NO. Nothing you say can or will be used against you! 

Fred" 

And some days later he gave us the results: 

"From: "Fred C. Dyer" «fdyer@JEEVES.UCSD.EDU» 

The results are in! Thanks to all who replied. 

Of about 262 subscribers to the list, only 26 replied, including me. The tally: 

Main Question: 

"Honey bee recruits can use the directional and distance information contained in the 
dances of foragers to bias their search for food." 

AGREE:26 DISAGREE:0" 

I presume nobody had read Frisch, Gould or Wenner and Wells! In any case, 
Wenner's response was, I think, most ungracious. He wrote: 

"With our present anonymous review systems still in place, would any wise person 
willingly vote openly against an opinion held by even 26 potential reviewers of manuscripts 
and grant proposals?" 

And Fred Dyer replied bitterly: 

"The statement would be insulting if it weren't so absurd. On the other hand, may 
be it isn't so absurd ...I know, let's have another survey! How many people would willingly 
vote openly against an opinion held by 26 potential reviewers of manuscripts and grant 
proposals? (PLEASE don't reply - I'm retiring from the survey business. I have better 
things to do, like figuring out more effective ways than surveys to identify people who 
disagree with my opinions.) Fred" 

Although I found this whole debate interesting and entertaining, I was struck 
by the fact that hardly any hard scientific data were discussed - it was really an 
expression of opinion and gossip by various people. But I am afraid that does not 
always prevent newcomers from making up their minds. Somewhere during this 
debate on the net, I saw the following mail: 

"From: MARLENE SNYDER, «marlene.snyder@ACADIAU.CA» 

Hi, I USED to think the evidence was strong for the language [hypothesis] - the 
current debate is interesting to me and I think I may be swayed to the Wenner side. Marty 
Snyder" 

The internet revolution is also bound to upset the traditional Student-Professor 
equations. Until recently the Professor almost always had priority access to new 
information - on account of his travels, attending conferences, refereeing papers 
and grant proposals, corresponding with colleagues and so on. Students often 
depended on their Professor for the latest information - especially information 
that had not yet found its way into print. But things are changing rapidly. Alas, 
the internet is not partial to the Professor. Indeed students seem to have all night to 
surf on the net and are now ahead of their Professors in the race for information. 
If the Professor, who is overburdened with administrative responsibilities, gets to 
hear the latest scientific results from across the globe from his students, that 
would be most welcome. If the Professor wakes up at 4 o'clock in the morning 
to write a learned text book but the student is making up his mind on crucial 
issues by listening to un-refereed gossip on the net, that would be a shame indeed. 
But I think both are bound to happen and we need to gear up to the challenges 
of the internet era - there's no such thing as a free lunch! 

References 

Gould, J.L. and Gould, C.G. (1988) The Honey Bee, Scientific American Library, New York. 

Wenner, A.M. and Wells, P.H. (1990) Anatomy of a Controversy - The Question of a "Language" Among 
Bees. Columbia University Press, New York. 





Digital Libraries 

V. RAJARAMAN 

IBM Professor of Information Technology, Jawaharlal Nehru Centre for Advanced Scientific Research 
and Hon. Professor, Supercomputer Education & Research Centre 

Indian Institute of Science, Bangalore 560 012 


1. Introduction 

All of us use libraries. Libraries classify, index and store books, encyclopaedias, 
periodicals, technical reports, conference proceedings, microfilms and tapes, etc. 
Of these, textual material, namely, books, periodicals, reports etc., are the ones 
which are most commonly used. Films, video tapes etc., are not very commonly 
referred to by users due to difficulty in handling and special facilities needed to 
use them. Besides a collection, a library also has specialized staff to assist readers 
with their requirements and the ambience to bring together scholars to collaborate 
on projects. Many libraries also provide selective dissemination of information to 
scholars and bring to their attention new developments in their area of work. 

Over the years libraries have faced many problems, among which are lack of 
space as volumes of books and journals increase, rising costs of books/journals 
and the inability to provide multiple copies of journals and reference books which 
many readers would like to read simultaneously. Apart from this, libraries have 
not found a good way of handling non-text material and this has led to its 
under-utilisation. 

Since the advent of computers, information professionals have been trying to 
use them in libraries. The most common early uses were in cataloguing and 
circulation control. Later abstracting services started producing tapes containing 
abstracts of articles which were very useful in disseminating information to users 
based on their interest profiles. 

There has been a convergence of a number of developments in computer 
technology in the last five years which has significantly affected the way computers 
can be used in libraries. These developments are: 





• Emergence of CDROMs (Compact Disk Read Only Memories) and now 
DVDROMs (Digital Video Disk Read Only Memories) with very high 
information storage capability. One DVDROM can store up to 7.5 Gigabytes 
(7.5 x 10^9 bytes). (To store a typical 500 page book 0.25 Mbytes are 
needed.) The cost of these storage devices is very low, around ten paise per 
Megabyte. 

• Development in computer network technology which has facilitated 
interconnecting computers not only within the country but also across 
countries leading to a world wide computer network. It is possible today 
to access any computer in the world connected to the network from any 
place. 

• Methods of digitizing, compressing and storing audio, graphics and video 
data. Standards have emerged for the format of audio, graphics and video 
data compression and storage which allow easy interchange of these data. 

• Advent of very powerful processors which can process multimedia 
information very fast. 

• Availability of high resolution video terminals which can display information 
on multiple windows. 

When all the above developments are combined we have a powerful technology 
to efficiently store multimedia information available in geographically dispersed 
locations, index them for easy retrieval and access the information from anywhere 
in the world using a Personal Computer connected to the international network 
of computers. This technology has led to the concept of a Digital Library. In this 
article we will answer the following questions: 

What is a digital library? 

What technologies have led to the development of digital libraries? 

What are the unique advantages of a digital library? 

How will digital libraries affect the work of scientists? 

What are some of the on-going digital library projects in the world? 

2. What is a Digital Library? 

There is no universally accepted definition of a digital library. We will give our 
definition, which succinctly brings out its special characteristics. A digital library 
is a collection of textual, numeric, graphic, audio and video data stored (in a 
computer) in digital form, indexed and logically linked for ease of retrieval. The 
collection is geographically dispersed and linked by a communication network, 
enabling retrieval of desired information by a large number of users who may be 
located in many places. The users will normally access the information they desire 
using a terminal or desktop computer at their place of work (or home). 

The main components of a digital library are: 

DATA 

• Textual data - This consists of books and journals stored in a digital form 
in a computer's disk store. There are two ways of storing this information. 
One way is to photograph a page and scan the image with a scanner. The 
scanner digitizes the image, storing a 0 for a white spot and a 1 for a dark spot. For 
good resolution one page will be represented by (800 x 1000) bits (or 100 
Kbytes). This form of storage is called a bit mapped form. Bit patterns do 
not carry information for indexing. This is, however, the only practical way 
of storing old manuscripts, texts and journals. The image of a page may be 
retrieved and displayed on the video screen of a computer. 

The other way of storing a text is to represent each character by its ASCII code. 
Texts generated using a word processor are already in this form. Most books and 
journals produced in the past few years will already be in this form. If a page has 
6000 characters it will need 6000 bytes of storage. Further, it will be easy to index 
the document using arbitrary words in the text. If a table has numeric information, 
the numeric data would be stored in coded form which allows it to be processed. 
Photographs or other complex figures in the text, however, will have to be scanned 
and stored as bit maps. 

As it requires less storage to store text in ASCII coded form, software is 
becoming available to scan printed texts using a scanner and convert them to 
coded form. Conversion by such software is, however, not 100% accurate and 
manual correction is required before the text is stored. For old texts using non 
standard or mixed fonts and for hand-written manuscripts such conversion software 
is not available. 
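
The storage figures quoted above can be checked with a few lines of arithmetic. This is a rough sketch only: it uses the numbers given in this article (800 x 1000 bits per scanned page, 6000 characters per coded page, 7.5 Gigabytes per DVDROM) and ignores compression, indexes and file system overheads. The variable and function names are chosen for illustration.

```python
BITMAP_BITS_PER_PAGE = 800 * 1000   # scanned page, one bit per spot
ASCII_BYTES_PER_PAGE = 6000         # coded page, one byte per character
DISK_BYTES = 7_500_000_000          # one DVDROM of 7.5 Gigabytes

bitmap_bytes_per_page = BITMAP_BITS_PER_PAGE // 8


def pages_per_disk(bytes_per_page, disk_bytes=DISK_BYTES):
    """Number of whole pages that fit on one disk, ignoring overheads."""
    return disk_bytes // bytes_per_page


print(bitmap_bytes_per_page)                                    # 100000
print(round(bitmap_bytes_per_page / ASCII_BYTES_PER_PAGE, 1))   # 16.7
print(pages_per_disk(bitmap_bytes_per_page))                    # 75000
print(pages_per_disk(ASCII_BYTES_PER_PAGE))                     # 1250000
```

On these figures a bit mapped page needs roughly seventeen times the storage of the same page in coded form, which is why conversion software is attractive despite the manual correction it requires.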

• Numeric data consist of tables of various types such as physical property 
data of various materials, data from experiments, astronomical tables etc. 
Such numeric data stored digitally may be used (if required) by curve 
fitting programs, spread sheet programs etc. 

• Graphics data may be photographs, maps, drawings etc. The simplest way 
of storing such data is to scan the image and store it as a bit pattern. There 
are better ways of coding and storing maps, drawings etc., which abstract 
the information contained in them. For example, maps may be stored using 
longitude/latitude as coordinates of cities, a linked list depicting the road 
network etc. Data stored in this form are easier to retrieve. 

• Photographs (both colour and monochrome) are stored in bit mapped form 
using compression algorithms to reduce storage space. 

• Audio data are digitized, compressed using a commonly accepted standard 
compression algorithm and stored. Musical scores may also be coded and 
stored along with the audio data (if required). 

• Video data require enormous storage space due to the need for repeating 
frames at least 30 times per second. Thus the data are compressed in such 
a way that, when decompressed, a close approximation of the original is 
recovered. Common standards for compression have evolved. The current standard is 
called MPEG-2 (Moving Picture Experts Group Version 2) and compresses 
one 90 minute video movie to occupy 7 Gbytes. 
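
To see why video compression matters, the uncompressed size of a movie can be estimated as follows. This is a back-of-the-envelope sketch: the frame rate, running time and compressed size come from the text above, while the frame size of 720 x 480 pixels and 3 bytes per pixel are assumptions made only for this example.

```python
WIDTH, HEIGHT = 720, 480        # assumed frame size in pixels
BYTES_PER_PIXEL = 3             # assumed 24-bit colour
FRAMES_PER_SECOND = 30          # from the text
DURATION_SECONDS = 90 * 60      # a 90 minute movie, from the text
COMPRESSED_BYTES = 7 * 10**9    # 7 Gbytes after compression, from the text

# Size of the movie if every frame were stored as an uncompressed bit map.
raw_bytes = (WIDTH * HEIGHT * BYTES_PER_PIXEL
             * FRAMES_PER_SECOND * DURATION_SECONDS)

# How many times smaller the compressed movie is.
compression_ratio = raw_bytes / COMPRESSED_BYTES

print(raw_bytes)                    # 167961600000
print(round(compression_ratio, 1))  # 24.0
```

Under these assumptions the raw movie would need about 168 Gbytes, so a 7 Gbyte file corresponds to a compression ratio of roughly 24 to 1.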

INDEXING 

Indexing and interlinking multimedia data is extremely important for 
ease of retrieval. Key words in textual documents are selected and linked 
to related words with logical links by appropriate software. This is called 
hypertext. For material in other media (audio, video) also, related elements 
are selected and linked in what is known as hypermedia. Such links 
allow a user to navigate through multimedia material. For example, from 
a multimedia encyclopaedia stored in a CDROM one may request 
information on the Taj Mahal. The computer would search the data and 
retrieve a page giving textual information about the Taj Mahal, which would be 
displayed on the video screen. If there is a reference to music in the text it 
may link to an audio clip giving a recording of classical music of that time. 
Links may also be present to video clips on the Taj Mahal and related subjects. 
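
The Taj Mahal example above can be pictured as a small data structure. The sketch below is purely illustrative: the item names, the dictionary layout and the helper functions are hypothetical and are not taken from any actual encyclopaedia or digital library software.

```python
# Each item has a media type and a list of links to related items.
# All entries are invented for illustration.
library = {
    "Taj Mahal": ("text", ["Mughal classical music", "Taj Mahal video tour"]),
    "Mughal classical music": ("audio", ["Taj Mahal"]),
    "Taj Mahal video tour": ("video", ["Taj Mahal"]),
}


def build_keyword_index(items):
    """Map each lower-case word of an item title to the titles containing it."""
    index = {}
    for title in items:
        for word in title.lower().split():
            index.setdefault(word, []).append(title)
    return index


def lookup(index, keyword):
    """Return the titles indexed under a key word (empty list if none)."""
    return index.get(keyword.lower(), [])


def linked_items(items, title, media_type):
    """Follow the links from one item and keep those of the given media type."""
    _, links = items[title]
    return [target for target in links if items[target][0] == media_type]


index = build_keyword_index(library)
print(lookup(index, "Mahal"))                        # both Taj Mahal items
print(linked_items(library, "Taj Mahal", "audio"))   # the linked audio clip
```

The key word index answers the initial request, and the links stored with each item let the user move from the text page to the related audio or video clips.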





LINKING 


The "information collection" of the digital library will normally not be 
stored in one computer. It will be distributed in many computers known 
as servers. All these servers will be linked by high speed communication 
links. The fact that the information in the digital library is distributed need 
not be known to a user as it is not relevant from his/her point of view. A
user gets a "seamless" access to the information based on his/her request 
regardless of its geographical location. 


USER 

• A user may access the library from anywhere using a terminal or a computer
connected to the network to which the information servers of the library
are connected.

To summarise, the key components of a digital library are: 

• A large collection of digitized and compressed multimedia data. 

• All data logically linked together and indexed with key words (or elements) 
to enable easy search and retrieval. 

• The information collection is geographically distributed on a computer 
network. 

• Users are geographically distributed and connected to the network. 

• Seamless access to the information in the network to users. 

3. Technologies Which Enabled Creation of Digital Libraries: 

The last few years have seen the phenomenon of the internet: an interconnected
worldwide network of computers. All computers connected to the internet follow
a standardized common protocol (a set of rules) to communicate with one
another.

The internet provides facilities to send and receive electronic mail (called
e-mail), which is widely used. The internet also supports a file transfer protocol
(abbreviated ftp). A directory of files (which may be text, audio, graphics or video)
resident in any computer in the network may be searched and a desired file may
be selected and transferred to another computer by the ftp program. Directories of
files and their locations (addresses of the computers where they are stored) are
available on the internet itself.

As the amount of information on the internet is huge (possibly several million
files), it is essential to have some method of locating a desired file and searching
for it using content descriptors. Tools known as search engines have been developed
and are easily available from the internet itself. Some of these search tools have
picturesque names such as Gopher, Archie, Veronica, WAIS, Yahoo, etc.

To allow browsing of information easily on the internet graphical user interfaces 
(abbreviated GUI and pronounced gooyee) have been developed. For textual 
information the idea of hypertext is used. In a hypertext key words in each 
document are highlighted and linked to other documents where the same keywords 
or related words occur. By moving a mouse to point to a word and clicking it, the 
GUI allows a user to navigate from one document to another. The documents may 
reside in any computer on the network. This idea can be extended to graphics, 
video and audio information also. 

A hypertext system used to link information stored on many computers is 
called the World Wide Web (abbreviated WWW). One searches for information in 
WWW with a program called a browser which assists in displaying hypertext 
documents, identifying hypertext links and retrieving linked files (multimedia). 
Two of the popular browsers are Netscape and Mosaic. Thousands of commercial
enterprises, newspapers (e.g., The Hindu), magazines (e.g., India Today), organizations
and individuals maintain a location with an address (called a home page) on the
web. Each page has its own unique web address called a URL (Uniform Resource
Locator). All web pages are written using a special language (or notation)
known as Hypertext Markup Language (HTML). HTML allows hypermedia links
using URLs. The web page of the Indian Institute of Science, for example, has the
URL (or web address):

http://www.iisc.ernet.in

In this address http stands for hypertext transfer protocol as the file transfer 
on the web follows this protocol (namely, commonly agreed set of rules for data 
transfer among computers). 
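
The parts of such a web address can be separated mechanically. The short sketch below uses Python's standard `urllib.parse` module on the Indian Institute of Science address, written without stray spaces; it is offered only as an illustration of how the protocol name and the computer's address sit inside a URL.

```python
from urllib.parse import urlparse

# The web address quoted in the text, split into its parts.
url = "http://www.iisc.ernet.in"
parts = urlparse(url)

print("protocol:", parts.scheme)   # the "http" discussed above
print("computer:", parts.netloc)   # the server that holds the home page
```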

HTML is a specific implementation of SGML (Standard Generalised Markup
Language), an international standard that defines a device-independent,
system-independent method for representing texts in electronic form using
descriptive markup. Many other markup languages for a variety of document
types such as manuals, books, journals, etc., are emerging based on SGML.

Figure 1: A Digital Library System

A schematic diagram of a digital library system is shown in Figure 1. We see
from this figure that the internet is an important part of a digital library. The other
parts of a digital library are:

Servers which are primarily computers connected to local networks within
organizations which have large storage devices connected to them. The storage
devices are normally DVDROMs (Digital Video Disk Read Only Memory) which
store over 7.5 Gigabytes each. As reading information from these devices is slow,
faster magnetic disk drives are used as buffers. Related information which may be
needed by a user is retrieved from DVDROM and stored in the buffer so that the
time to access information is, on the average, reduced.
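
The buffering idea can be sketched as follows. This is an illustration under stated assumptions: the text does not say how the buffer decides what to keep, so a least recently used policy is assumed here, and the class name `DiskBuffer` and the page names are hypothetical.

```python
from collections import OrderedDict


class DiskBuffer:
    """A small, fast buffer in front of a slow store (assumed LRU eviction)."""

    def __init__(self, slow_store, capacity):
        self.slow_store = slow_store      # stands in for the DVDROM
        self.capacity = capacity          # number of items the buffer holds
        self.buffer = OrderedDict()       # stands in for the magnetic disk
        self.slow_reads = 0

    def read(self, key):
        if key in self.buffer:
            self.buffer.move_to_end(key)  # mark as most recently used
            return self.buffer[key]
        self.slow_reads += 1              # had to go to the slow device
        value = self.slow_store[key]
        self.buffer[key] = value
        if len(self.buffer) > self.capacity:
            self.buffer.popitem(last=False)   # evict least recently used
        return value


dvd = {"page1": "text", "page2": "audio", "page3": "video"}
cache = DiskBuffer(dvd, capacity=2)
for key in ["page1", "page2", "page1", "page3", "page1"]:
    cache.read(key)
print("slow reads:", cache.slow_reads)
```

Five requests here cause only three reads from the slow device, which is the sense in which the average access time is reduced.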






Data Conversion and Compression Hardware particularly for storing audio 
and video information. 

High resolution video displays with multiple windows and good graphical 
user interface. 

Companies such as IBM are taking up projects to integrate all these technologies 
and create a software environment to design digital libraries. 

4. Unique Advantages of a Digital Library 

The fact that all types of data are in a digital form and data is distributed and
accessible from anywhere endows some unique advantages to a digital library. We
discuss some of them in this section.

Unlike in traditional libraries, documents are not physically handled by a user.
A user views the necessary document and prints the portions of interest at his/
her location. Further, many users can simultaneously view a document. Thus
documents are not "lost" due to theft or misplacing. Documents do not tear due
to wear. The library need not have multiple copies of popular journals. Rare manuscripts
may be viewed by many, as only images are accessed by users.

Unlike printed text where tables of numbers can be studied but not easily 
processed, in a digital library the numbers in a table can be processed as they are 
in digital form. In print material numbers are "passive" whereas in digital form 
the same numbers are "alive" as they can be processed and transformed. For 
instance, a table may be used in a spreadsheet program and we can find out how
altering some entries changes other entries. A user may also try to fit appropriate
curves to a set of numbers in a table and give his/her own interpretation of the 
data. 
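
As an illustration of numbers being "alive", the sketch below fits a straight line to a small table by least squares. The table values are invented purely for illustration, and a straight line is only one of many curves a user might choose to fit.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical table values, invented purely for illustration.
years = [1, 2, 3, 4]
readings = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(years, readings)
print(slope, intercept)
```

For this invented table the fitted line is y = 2x + 1; with a printed table the same exercise would first require retyping every number.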

A digital library can store unconventional information such as readings from 
some scientific instruments such as spectrometers. All modern instruments 
incorporate microprocessors and experimental data are already in digital form. A 
repository of experimental data can be stored in a server and made available to 
others to assist in their work. 

Searching for information is much easier in a digital library system due to 
global indexing and use of hypertext and search engines. Fast communication 
networks allow widespread search for required material. 





The ability to digitally store and retrieve multimedia data allows a digital 
library to effectively provide access to audio and video information. Providing 
such access is particularly difficult in current libraries due to difficulty in handling 
such material and wear and tear if many persons access the material. Digital 
libraries also provide an effective method to preserve rare manuscripts, vintage 
movies, old music, etc., without denying access to users. 

Digital libraries provide users access to current information which normally 
takes a long time to reach a traditional library. For example, many authors store
their latest research reports on their web pages and permit free access. Many
conference papers may be found on the World Wide Web. A digital librarian can collect
articles of interest to a research group and provide them on a local server, thereby
saving users' time.

The internet allows easy access to discussion groups in specified subjects.
Users of a digital library can find researchers working in similar areas and attempt
collaborative work. A digital librarian can enable this collaboration.

We are also seeing the emergence of journals published only in "electronic 
form" with no print version. This reduces cost and allows prompt, wide 
dissemination of the journal. Some journals also provide means for readers to
comment on an article and link their comments to it. Such informal review is
unorthodox but is very useful for a prospective reader.

Many other types of information not found in a traditional library may be
stored in a digital library, such as computer aided lessons, lecture demonstrations
by musicians, dancers and artists, video recordings of important speeches (such as
Nobel lectures), etc.

To summarise, the unique features of digital libraries are:

• Safe storage and multiple access of material.

• Ability to process numerical data published in the literature.

• Ability to store a variety of data such as audio, video, graphics, data output
from experiments, computer aided lessons, lecture demonstrations by artists,
famous lectures, etc.

• Ability to access information "hot off the press." 





• Accessing information available anywhere in the world from anywhere
else in the world. This alleviates to some extent the disadvantage felt by
scientists living in remote areas.

• Ease of search and retrieval. 

5. How Will Digital Libraries Affect the Work of a Scientist? 

The major impact of a digital library system on a scientist will be the need to have
ready access to such a system if one has to keep up-to-date. As many scientists will
place their most recent work on a web page, easy access to it is essential. This implies
that a scientist should be connected to the internet and have rapid access to web
pages. This necessitates good communication infrastructure in a country. The
current situation in India in this respect is poor and one needs low cost, wide band
communication.

In the last three decades we have seen an acceleration of scientific development 
and proliferation of publications. The rapidity of change and proliferation of 
material will be accentuated. A scientist will be deluged with information and has 
to find a technique of handling this "information overload". We will see the 
emergence of "software agents" which can be tailored to satisfy the needs of 
individual scientists. Such an agent will filter out irrelevant information and seek 
out relevant information from the system. We will see a widening of the gap in the
working conditions of scientists with regard to availability of information. The
"haves" will have excellent communication and computer facilities with a variety 
of software agents while the "have nots" will be isolated due to poor communication 
and consequently non-availability of published material. 
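
The filtering role of a software agent can be sketched very simply. The code below is only a toy under stated assumptions: the interest profile, the article titles and the scoring rule (counting matching words) are all hypothetical, and real agents would need far more than keyword matching.

```python
def relevance(text, interests):
    """Count how many of the scientist's interest terms appear in the text."""
    words = set(text.lower().split())
    return sum(1 for term in interests if term.lower() in words)


def filter_articles(articles, interests, threshold=1):
    """Keep titles whose relevance meets the threshold, best match first."""
    scored = [(relevance(title, interests), title) for title in articles]
    kept = [(score, title) for score, title in scored if score >= threshold]
    kept.sort(key=lambda pair: -pair[0])   # stable: ties keep input order
    return [title for _, title in kept]


# Hypothetical profile and article titles, invented for illustration.
interests = ["compression", "video", "indexing"]
articles = [
    "Indexing video archives with compression aware search",
    "A history of the postal service",
    "Advances in audio compression",
]
print(filter_articles(articles, interests))
```

The agent drops the article that matches none of the scientist's interests and ranks the remaining two by how many interests they touch.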

Currently, a large number of journals insist that articles be submitted in 
machine readable form to expedite publication of journals. Many journals now 
have a hard copy and an electronic form, that is, articles stored digitally in a server 
and accessible via the internet. Such an access is particularly good for scientists in 
India as it reduces postal delay. Print form will remain, as it is easy to read. We 
are also seeing the emergence of purely electronic journals (without a print version). 
Purely electronic publications will increase in number as they are cheaper and more
convenient to publish, and scientists will find it increasingly important to have good
internet access to refer to these journals.

Another important issue which will arise is the charge for the use of
a digital library system. Currently most of these projects are funded by Governments
and no charging policy exists. In the long run, however, one may have to pay for
accessing and downloading articles. A scientist browsing through many articles
may have to pay a substantial amount and this may inhibit browsing. This will
particularly affect scientists in India who have limited budgets.

Digital library systems will have directories of scientists working in similar
areas all over the world. Access to such directories would promote collaborative
work. With increasing bandwidth of international networks, video conferencing
over these networks will also be possible. Thus a scientist in India will find it
easier to collaborate with scientists elsewhere in the world. Access to electronic
discussion groups will also enable scientists in India to know about current trends
(topics of research, etc.).

One of the most interesting things that may happen is the availability in 
digital form of a variety of experimental results. All sophisticated modern 
instruments used in experiments employ microprocessors and have digital readouts. 
Thus it is easy to store results of experiments in a digital "server" and provide 
access to this data to other scientists. Groups of scientists may collaborate to store 
the data in a repository which may then be used in various research projects. Apart 
from such experimental data there are other areas where digital access to a large 
group of scientists would expedite progress in an area. An example is the human 
genome project in which gene maps are being provided by international groups. 

We hasten to add that printed journals and libraries as we know today will 
remain for quite some time to come. The ease of reading printed material, 
serendipitous search in traditional libraries and human contacts in libraries cannot 
be replaced by digital library systems. 

6. Ongoing Digital Library Projects 

There are a large number of digital library projects all over the world. A special issue
of the Communications of the ACM describes many of these. We will summarise
some of them in this section. Most of these projects are funded by NSF/ARPA/
NASA of U.S.A.

Digital Video Library (Carnegie Mellon University)

This project's aim is to develop new technologies for creating a digital video 
library and software to search the content and retrieve relevant information. The 
library will initially contain 1000 hours of video from the archives of an educational 
TV station and BBC video course. It will be initially used by high schools. 





The Stanford Digital Library 

The aim is to develop enabling technologies for an integrated "virtual library" 
which will link heterogeneous collections such as small personal information 
collections, collections of large conventional libraries and large data collections 
shared by scientists. 

University of California, Berkeley's Digital Library 

The aim is to create the technologies to store very large multimedia data bases 
and intelligent retrieval mechanisms. As a prototype, a collection of California 
environmental data consisting of technical reports, computer models, aerial and 
ground photographs, database of California bio-system, maps and state's plans 
will be stored. Intelligent retrieval techniques will be tried on this prototype. 

Alexandria Digital Library 

The aim of the project at the University of California, Santa Barbara is to 
develop a distributed system that provides a comprehensive range of library 
services for collections of spatially indexed and graphical information such as 
maps and images. Users will be diverse, ranging from school children, to 
researchers, to general public. 

Illinois Digital Library 

This project aims to store a large collection of science and engineering journals
(complete contents) and develop good display and search systems.

University of Michigan Digital Library 

This project will emphasise a diverse collection with focus on earth and space 
sciences which can satisfy the needs of many different types of users. A related 
project will digitize and store all available issues of 10 journals on economics from 
the first publication to 1990. 

The British Library Access 

The aim is to investigate software and hardware platforms for digitization 
and networking. The programme will establish standards for storage, indexing, 
retrieval and transmission of data and will examine the copyright issues involved 
with digitization of material and its provision over networks. Some of the major
projects are to provide access to 34 million patents held by the library and to store
the unique manuscript of the 11th century Anglo-Saxon epic Beowulf, digitized at
a resolution of 2000 x 3000 pixels in 24 bit colour. Access will be provided via
the internet. Another project is an electronic photoviewing system for the library's
photo archives.
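
The storage implied by the Beowulf scanning resolution quoted above can be worked out directly. The sketch below assumes the images are held uncompressed, which the text does not state; the helper name `image_bytes` is introduced here only for illustration.

```python
def image_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes of one bit-mapped image."""
    return width * height * bits_per_pixel // 8


page = image_bytes(2000, 3000, 24)   # figures quoted in the text
print(page)                          # bytes per manuscript page
print(page / 10**6, "Mbytes")
```

Each page image would therefore occupy 18 million bytes before any compression is applied.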

IISc Digital Library

There are several other projects which are coming up at several places. Closer 
home the Indian Institute of Science and IBM will be collaborating to create a 
digital library which will digitize and store the Journal of Indian Institute of 
Science and research publications of the faculty. Besides this there is a plan to 
duplicate and store locally the information on some frequently used international 
file servers. This will reduce communication cost. 

7. Conclusions 

We have seen that the emergence of many technologies, both software and
hardware, has led to a revolution in the way information is created, stored and
disseminated. Rapid international access to the variety of information distributed
across the world will to some extent alleviate problems faced by scientists in the
developing world with regard to information availability. There are, however,
many problems which need to be resolved before a worldwide digital library
system develops. Some of these are:

• Copyright problem. It is easy to copy digital information. Methods have to
be found to prevent illegal copying without inhibiting legal users.

• Material being digitized for digital libraries is in both page image form
and ASCII coded form. ASCII form is preferable for wide use of textual
matter but technology is not yet available for digitizing archival material in this form.
Information currently available on CDROMs, such as encyclopaedias and
dictionaries, will have to be made available online.

• A disturbing aspect, particularly for developing countries, is the rapid
obsolescence of both hardware and software. This would impose a higher financial
burden through the need to replace equipment.

• Special precautions need to be taken to prevent corruption of data by 
vandals bent on mischief. 

• Lastly, we are not used to the idea of paying for reading an article in a
journal. The question of who pays when information is accessed and how
much payment is considered reasonable has to evolve. Costs determined
by the developed world may be too high for a scientist in a developing
country. One has to resolve these issues soon.

To conclude, digital library systems are an exciting new source of information
for scientists and will grow rapidly. Scientists in India have to be prepared to make
the best use of these new facilities.

References: 

1. Commun. ACM, U.S.A., 38, No. 4, April 1995 (Special Issue on Digital Libraries).

2. IEEE Comp., 29, No. 5, May 1996 (Special Issue on U.S. Digital Library Initiative).

3. Fox, Edward, Digital Library Source Book, 1993 (available at web address: http://fox.cs.vt.edu/DLSB.html).

4. Berkeley Digital Library Sunsite (Information on digital libraries) (Web address: http://sunsite.berkeley.edu).

5. Rajasekhar T.B., Digital Libraries, Resonance (Indian Academy of Sciences Journal on Science
Education), 2, No. 4, April 1997.

6. Fox, Robert, Tomorrow's Library Today, Commun. ACM, 40, No. 1, Jan. 1997, pp. 20-21.





Access Control Security 

SATISH SUKUMAR 


Planet Asia.com, Microland Garden 
777 HI Cross, 18th Main, Koramangala, Sixth Block,
Bangalore-560095 


Introduction 

Intranet technologies based on the TCP/IP protocol bring tremendous benefits to
the corporation in terms of enhanced communication, information sharing and
collaborative computing. Phenomenally high return on investment figures have
been observed with the deployment of intranets. An extranet involves the connectivity
of the corporate intranet to the Internet to provide users seamless access to information
and to be able to communicate and collaborate within and outside the organization.
It is becoming increasingly important to be able to access a greater number of
services offered on the internet, as well as to provide them on the intranet to a greater
number of users, to provide that little edge over the competition. There is a whole new
class of applications being designed and deployed. A typical intranet of today
includes mechanisms to store, search for and retrieve information of amazing variety
including text, images, audio and video; directory and yellow page services;
collaborative and communication services; and also enterprise class applications
which allow the systems of old to coexist with the new.


But there be dragons both internal and external 

However, as is now very widely accepted, among the other denizens of the 
internet, lurk crackers who, for a variety of reasons enjoy breaking into various 
computer systems. 

Steven M Bellovin 
AT&T Bell Laboratories 

These technological mercenaries are hired to conduct computer espionage and
sabotage using sophisticated software techniques. What is also very alarming is
a growing number of security violations by trusted users on the inside of the
security perimeter.

Add to this a whole host of weapons of attack, which are freely available on the
Internet, and a network administrator has a full time nightmare.

Designing and implementing a security solution is, for a network administrator,
like being caught between a rock and a hard place. If they are too lax they are
asking for trouble; if they are too tight the business suffers. It is hence essential to
strike the right balance based on the understanding of a whole array of technology.
This white paper looks at the various methods of deploying Access Control to
Network resources, which is the first step in the deployment of an enterprise wide
security policy.

Defining a Security Policy 
Elements of A Security Model 

The following are the basic elements of information security architecture. 






1) Access Control - Controlling access to network resources

2) Authentication - Identifying who the user is

3) Authorization - What the user can do; what the user's rights and privileges are

4) Data Privacy - Protection of data from eavesdropping or unauthorised reads

5) Data Integrity - Protection of data from unauthorised modification

6) Non-Repudiation - Proof of ownership

Any security model should be based on the following basic principles: 

• The technology used is solid and hence is itself not a security risk 

• The security mechanism is simple and easy to manage 

• The security policy should be aligned with the organization's business 
direction 

The above elements are deployed keeping 3 targets in mind: the user, the 
network resources and the O/S and applications. 

A security policy is a matrix which links the basic elements with the 3
targets. It hence defines a set of relationships between the user, the network
resources, the O/S and the applications, and attempts to deploy measures to secure
these relationships. It is based on solid technology and is easy to use and
manage. The security policy is always aligned with the organization's
business direction.






Since the design of security policies involves a lot of study into relationships,
which are inherently complicated, the following guidelines are always found
helpful:

1) Keep it simple. 

2) Prioritize what you must do. 

3) Educate the user. 

4) Interpret the policy in simple terms. 

5) Back the security policy with the highest office in the organization. 

A complete security envelope for an enterprise would involve the following 
technologies: 

• Access control and authentication mechanisms to keep out intruders

• Transaction systems to secure end to end transactions and guarantee the
privacy and integrity of these transactions

• Mechanisms for secure payment, which allow organizations to conduct
business across the net.

The classical security requirements are:

1) Confidentiality 

2) Authentication 

3) Data Integrity 

4) Access Control 

5) Non Repudiation Services 

6) Protection against Malicious Software Programs 

7) Protection against Denial of Service Attacks 

Unauthorised disclosure of information via eavesdropping and wiretapping 
is perhaps the most common threat that comes to one's mind when one thinks 
about network security attacks. Confidentiality service is used to protect 
information from unauthorised disclosure. In a network environment, often it is
important that such a service is provided in an end-to-end manner, thereby 
ensuring that the information is protected over the complete network path. 

In a masquerading attack one entity pretends to be another and attempts to
gain privileges and access to information and resources to which it is not authorised.
User and origin authentication services can be used to counteract such attacks.
Mechanisms used to realize this service include the use of challenge-response and
cryptographic techniques in the implementation of secure authentication protocols.
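
A challenge-response exchange of the kind mentioned above can be sketched as follows. This is a minimal illustration and not a complete authentication protocol: it assumes the two parties already share a secret key, the key shown is hypothetical, and the use of HMAC with SHA-256 is a choice made here, not something specified in the text.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: a fresh random value, never reused."""
    return secrets.token_bytes(16)


def respond(shared_key, challenge):
    """Client side: prove knowledge of the key without sending it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key, challenge, response):
    """Server side: recompute the expected response and compare safely."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


# Hypothetical key, assumed to have been shared in advance.
key = b"example shared secret"
challenge = make_challenge()

genuine = respond(key, challenge)
impostor = respond(b"wrong guess", challenge)

print(verify(key, challenge, genuine))    # genuine user
print(verify(key, challenge, impostor))   # masquerading entity
```

Because each challenge is fresh, a response captured by an eavesdropper is of no use in a later masquerading attempt.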

Another common attack is unauthorised access to network resources.
This can involve network components such as printers or network resources
such as operating systems, databases and applications. Access control service
provides protection against unauthorised access and use of resources in a network
system.

The threat of unauthorized modification of information and resources causes 
integrity violation. Such an attack may involve unauthorised insertion and deletion 
of information transferred over the network. This attack often occurs in conjunction 
with other attacks such as replay whereby a message or part of a message is 
repeated intentionally to produce an unauthorized effect. Integrity service provides 
for the protection of information from unauthorised modification.
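
One way an integrity service can detect both modification and replay is sketched below. It is an illustration under assumptions: a shared secret key (hypothetical here), a message authentication code computed with HMAC and SHA-256, and a sequence number that must strictly increase. The text does not prescribe any of these particular mechanisms.

```python
import hashlib
import hmac


def protect(key, sequence, message):
    """Sender: attach a MAC computed over the sequence number and message."""
    data = sequence.to_bytes(8, "big") + message
    tag = hmac.new(key, data, hashlib.sha256).digest()
    return sequence, message, tag


class Receiver:
    """Accepts a message only if it is unmodified and not a replay."""

    def __init__(self, key):
        self.key = key
        self.last_sequence = -1

    def accept(self, sequence, message, tag):
        data = sequence.to_bytes(8, "big") + message
        expected = hmac.new(self.key, data, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            return False                  # modified in transit
        if sequence <= self.last_sequence:
            return False                  # replayed or out of date
        self.last_sequence = sequence
        return True


key = b"example shared secret"            # hypothetical key
receiver = Receiver(key)
packet = protect(key, 0, b"pay 100")

print(receiver.accept(*packet))                       # original
print(receiver.accept(*packet))                       # replay
print(receiver.accept(1, b"pay 900", packet[2]))      # tampered
```

The original message is accepted once; the repeated copy and the altered message are both rejected.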

Repudiation of actions is another form of attack that can occur in a networked 
system. It occurs when a sender (or a receiver) of a message denies having sent 
(or received) the information. The non-repudiation security service that can be 
used to counteract such a threat provides proof of the origin or delivery of 
information. Provision of such a service requires some form of digital signature 
mechanism. Such a service also implies the existence of an agreed trusted third 
party whose primary role is to arbitrate disputes resulting from non-repudiation. 

Malicious software programs are introduced into computer assets for
the purposes of stealing confidential information, espionage or sabotaging computer
systems. Data protection services, which identify and destroy these programs, are
one of the counter measures available.

A denial-of-service attack involves one entity denying a service to
another entity even though the latter is authorized to access that service.
That is, an entity prevents other entities from carrying out their legitimate functions.
In a network, this form of attack may involve blocking the access to the network
by continuous deletion or generation of messages so that the target is either
depleted or saturated with meaningless messages. Denial of a service can be
regarded as an extreme case of information modification in which the information
transfer is either blocked or drastically delayed. The measures provided by
confidentiality, integrity and authentication services can be used to detect some,
but not all, forms of denial-of-service attacks. For instance, they cannot
detect such attacks if they begin while the communication association is quiescent.





In such a situation, the receiving entity has no way of determining when the
next information should arrive. In fact, it will remain unaware of the attack until
it attempts to send information itself. In many cases, the entity attempting to send
the information will detect the attack but it has no way of notifying the other
entity. A measure against such an attack is to have periodic exchange of
information between entities to ensure that an open path exists between them.
The greater the frequency of such a request response mechanism, the shorter the
time period during which the denial-of-service attack will remain undetected.
However, the disadvantage is that this reduces the effective bandwidth of the
network.

The security audit service is somewhat orthogonal to all the security services
described above in that it is not directly involved in the prevention of security
violations but assists in their detection.

Figure: An example SYN attack used for denial of service. The attacker first
masquerades as B, a computer which cannot respond with ACK to SERVER, and
then sends SYN packets to SERVER. The number of half-open TCP/IP
connections on SERVER increases, and legitimate clients can no longer open
TCP/IP connections with SERVER.

Security Management 

Key to provision of a security service is its management. A network security 
architecture needs to support the management of these services and how changes 
in policy and its enforcement can take place. For instance, in the case of 
confidentiality and integrity services, it is necessary to manage the keys used in 
the encryption and decryption processes. In the case of access control service, we 
need to manage the access control information such as access control lists and the 
access rules. Similarly in the case of authentication, authentication information, 
e.g., passwords and keys, needs to be managed. In the case of auditing, the 
management of audit trails and audit analysis is necessary. 

In networked systems, it is likely that there is no single authority that controls
the entire environment. For instance, in an organization there may be several
managers, each responsible for a subset of users, objects and operations. This does not
mean that it is not possible to control security in a distributed environment
centrally. However, even central security authorities end up trusting that the
authorities responsible for local systems have implemented appropriate security.
There may be several authorities performing different aspects of these security
management functions: access control authorities, authentication authorities, key
management authorities and audit management authorities. In practice several of
these functions may be handled by a single authority.

Organisational Decisions to be made 

The following decisions have to be made by the organization for designing and
implementing a security policy:

• The team of people responsible for the creation and management of the 
security policy. 

• Measures to be taken if a breach of the security policy happens both for 
external as well as internal breaches. 

• Measures to be taken if an internal user breaches the security policy of an 
external organization. 

Guidelines to the Design of a Security Policy 

1) Identify the team of people who will be responsible for the creation and 
management of the Security Policy. 

2) Perform a complete audit of every information resource. This has to be a 
structured audit designed for finding security weaknesses. 

3) Use the advice of a security expert to help understand the relationships 
between users, network resources, the O/S and applications. 

4) Identify possible enemies, their targets, and the weapons they can use.

5) Set priorities concentrating on the most likely invasion routes. 

6) Design a security policy which is as simple as possible. 

7) Never give more access than what is absolutely required. 

8) Use the strongest authentication mechanisms which are feasible. 

9) Implement the first version of the security policy. 

10) Interpret the security policy in common terms and raise awareness. 

11) Audit the policy. 

12) Modify the policy as required. 





A Security Policy hence defines in detail an organization's stance on security 
and explicitly defines the reaction of the organization to a security breach. It must 
also define the person responsible, who is contacted if a security breach occurs and 
the actions to be carried out by this person. In addition, the policy must define the 
reaction of the organization if internal users violate the security measures of an 
external organization. 

In summary, there is no security measure which is "Plug and Play"; rather, it is
always a situation of "Plug and Pray". The design of a security measure for an
organization requires a complete understanding of the information environment 
from a security point of view as well as the understanding of key technologies to 
be able to secure this environment. 


Defining the Scope of Access Control Security 

This white paper deals with access control security from the point of view of
securing an organizational network against unauthorized access. It does not deal with physical
access control like entry restriction mechanisms. This document also expands the 
definition of access control to cover authentication techniques. 

Access Control devices are, often, the first lines of defence for corporate 
networks. They can be based on numerous and radically different technologies but 
are all aimed at solving the problem of restricting access to network resources. 
These products selectively permit or deny access to network resources based on 
a number of criteria which may include IP addresses and port numbers, login 
names and passwords, digital certificates, or phone numbers. 

The deployment of an access control mechanism involves the deployment of 
technologies to perform one or more of the above functions. The design of an 
access control mechanism involves the identification of entry and exit points to the 
network, the nature of traffic flowing through these points and the identification 
of suitable technologies to secure this traffic. 

Each access control device is good for performing a specific task. It is hence not 
a very good idea to just plug in one device and believe that the network is secure. 
The preferred method of deployment is to identify the specific needs of who needs 
to access what and how and then deploy the technology/technologies best suited 
to perform the task in hand. It also must be remembered that access control
devices only try to restrict access and provide authentication. If a particular user
is given access to a network resource then the access control device's role ends. 
How secure the end device now is, is based on the authorization mechanisms 
deployed on that device. 


[Figure: Access Control can be deployed at various points in a network. Callout:
Transaction and automation servers use strong password mechanisms for access
control.]


It is also a good idea to have multiple perimeters of security, with access being 
restricted at each perimeter. This can be done by deploying multiple firewalls or 
deploying access lists on existing routers. One basic principle of access control is 
do not give more access than what is absolutely required. 

TCP/IP 

The TCP/IP protocol suite forms the base component for all traffic on the Internet. 
TCP/IP, however, was never designed with security in mind, as a result of which
almost every service running on TCP/IP has been proven to exhibit some level of
vulnerability. This is good because most of the vulnerabilities are known and
there are solutions to fix them. This is bad because unless the organization has
security awareness these holes remain open and are huge security risks.

Any TCP/IP connection between 2 hosts is characterized by the following 
parameters: 

The source address and the source application port (TCP or UDP). 

The destination address and the destination application port (TCP or UDP) 

TCP, the Transmission Control Protocol, and UDP, the User Datagram Protocol,
form the 2 transport level protocols in the TCP/IP stack. Applications use
either TCP or UDP for all communications. TCP is connection oriented and ensures
reliable delivery. UDP is connectionless and is a much lighter protocol than TCP.
Applications that use TCP include FTP, Telnet, HTTP, Rlogin, NNTP, SMTP, NET
etc. There is an increasing class of important applications which use UDP, most
of which are for the distribution of multimedia.

Applications are identified by TCP/UDP port numbers. The port abstraction, in
very simple terms, is an identifier used by the OS on which the application is
running to uniquely identify the instance of the process. For example:

A machine 192.9.200.1 could be running 2 FTP client applications, one to
netscape.com and the other to cisco.com. These 2 client applications
need to be uniquely identified inside the system so that data coming through
from the TCP/IP stack is passed to the correct instance of FTP. This level
of identification is done using unique TCP port numbers for each instance
of client FTP.
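The per-instance port assignment described above can be observed directly. The following is a minimal sketch, assuming a throwaway listener on the loopback interface stands in for the remote servers; the function name open_two_clients is hypothetical. Two client sockets on the same machine connect to the same server, and the OS gives each its own local port.

```python
import socket


def open_two_clients():
    """Open two TCP clients to one local server; return (server_port, client_ports)."""
    # Throwaway listener on loopback; port 0 asks the OS to choose a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(2)
    server_addr = server.getsockname()

    clients, accepted = [], []
    for _ in range(2):
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.connect(server_addr)            # the OS allots the client port here
        conn, _peer = server.accept()
        clients.append(c)
        accepted.append(conn)

    # Each client instance is identified by its own local port number.
    client_ports = [c.getsockname()[1] for c in clients]

    for s in clients + accepted + [server]:
        s.close()
    return server_addr[1], client_ports
```

Both clients share the same IP address and talk to the same server port, so the dynamically allotted client port is the only thing that tells the two instances apart.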

Most applications exist in 2 forms: client and server. The client application is
almost always the initiator of the connection, while the server application is
almost always in listen mode. Since server applications need to be listening, they
need to listen at fixed ports. Most server applications run at fixed pre-assigned
ports to which a client application connects by default, without the need to pass
the client any more information than the IP address of the server. Client applications
are normally allotted ports dynamically by the OS for each instance of their
execution. These ports are normally greater than 1024.

In the TCP/IP world there are 2 classes of port, the distinction between which
is increasingly getting blurred. Ports less than 1024 are called privileged ports. In
order to run an application at a port lower than 1024 you require root or super user
permissions at the OS level. While most servers do run at pre-assigned ports there
is no restriction on the port at which you run a server, provided the client knows
that it needs to reach that port. For example, you can run an HTTP server at port 8080,
and a client that needs to access this server would specify http://your.http.server:8080.
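As a small illustration of the two points above, the sketch below classifies a port as privileged and extracts the port a client would connect to from a URL. The helper names and the default of 80 for HTTP are assumptions made for the example.

```python
from urllib.parse import urlsplit

PRIVILEGED_LIMIT = 1024  # ports below this need root/super user rights on Unix-like systems


def is_privileged(port: int) -> bool:
    """True for ports in the privileged range 1..1023."""
    return 0 < port < PRIVILEGED_LIMIT


def server_port(url: str, default: int = 80) -> int:
    """Port the client must reach: the explicit one in the URL, else the default."""
    port = urlsplit(url).port
    return default if port is None else port
```

For http://your.http.server:8080 the client connects to port 8080, which is outside the privileged range, so such a server can be started without super user permissions.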


Connection Establishment in TCP/IP: The TCP 3-way Handshake Process 

TCP/IP uses a sequence number to indicate the position of data in the sender's 
byte stream and an acknowledgement number to indicate the number of the octet 
the source expects to receive next. TCP/IP also uses the SYN, ACK, RST and FIN 
bits of the code field to indicate how to interpret the fields in the header. 


The client decides to use an Initial Sequence Number (ISN) of 1000. Since it is the
first packet of the session the SYN flag is set. The ACK flag is not set.

The server replies with a packet with the server's ISN as 2000 and the
Acknowledgement number as 1001. The server has the SYN and the ACK flag set. The
server is now in a half-open condition.

[Figure: TCP connection establishment and teardown]

Client 192.9.200.1                                          Server 200.1.2.3

Active Open               SYN(1000) ------------------->    Passive Open
                          <--------- SYN(2000), ACK(1001)
Connection Established    ACK(2001) ------------------->    Connection Established
                          ACK, DATA
Client close              ACK(2300), FIN(1500) -------->
                          <------------------- ACK(1501)
                          <-------- ACK(1501), FIN(2400)    Server close
Connection Closed         ACK(2401) ------------------->    Connection Closed

The client has to reply to the server with a packet, which has Sequence number 
1001, acknowledgement number 2001 and SYN flag set to 0, and Ack flag set to 
1. Only after receiving this packet will the TCP/IP implementation on the server 
go from 1/2 open to a full open or connection established state. 

FIN flags are used to tear down TCP/IP connections. RST is used to reset 
connections. 
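The sequence and acknowledgement numbers of the handshake described above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not a TCP implementation: the Segment type and the function name are hypothetical, and only the fields discussed in the text are modelled.

```python
from typing import List, NamedTuple, Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


class Segment(NamedTuple):
    sender: str
    seq: int
    ack: Optional[int]   # None when the ACK flag is not set
    syn: bool
    ack_flag: bool


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments that establish a connection."""
    # 1) Client -> server: SYN carrying the client's ISN, no ACK.
    syn = Segment("client", client_isn, None, syn=True, ack_flag=False)
    # 2) Server -> client: SYN + ACK; the server is now half-open.
    syn_ack = Segment("server", server_isn, (client_isn + 1) % SEQ_SPACE,
                      syn=True, ack_flag=True)
    # 3) Client -> server: ACK only; the connection becomes established.
    ack = Segment("client", (client_isn + 1) % SEQ_SPACE,
                  (server_isn + 1) % SEQ_SPACE, syn=False, ack_flag=True)
    return [syn, syn_ack, ack]
```

Calling three_way_handshake(1000, 2000) yields the values used in the text: the server acknowledges 1001, and the client's final packet carries sequence number 1001 and acknowledgement number 2001.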

Some of the attacks based on vulnerabilities in the TCP/IP protocol stack: 

Attacks based on the prediction of TCP initial sequence numbers. 

An attacker can use sequence number prediction to construct a TCP packet 
sequence without ever receiving any responses from the server. This allows him 
to spoof a trusted host on a local network. 

Exploiting the TCP/IP 3-way handshake mechanism to generate denial of
service attacks.

During the second stage of the TCP handshake process the server has received 
a SYN packet from the client and has responded with a packet with SYN & ACK 
set. The server's TCP/IP stack at this moment is in a half open state. Any O/S has 
a small finite number of allowable half-open TCP/IP connections. If this number 
is exceeded then the server cannot accept any further TCP/IP connections till this 
number drops below the threshold. This can be exploited to generate a denial of 
service attack. The attack takes place as follows: 

1) The attacker sends a SYN packet to the server using a spoofed IP address
as the source. This address can be that of a machine the attacker knows for
sure cannot respond to the server, or any of the RFC 1597 specified addresses
reserved for use on private address space, or 0.0.0.0 or 255.255.255.255.

2) The server responds with a SYN-ACK to this packet to the source address
above.

3) The source address does not reply with an ACK.

4) The server now has a half-open TCP/IP connection.

5) The O/S allows only a finite number of 1/2 open TCP/IP connections.

6) The attacker continues to pump the server with SYN packets.

7) The server exhausts its quota of 1/2 open connections and cannot accept
any more connection requests.

8) At this point various things can occur, even a system crash. The simplest
thing which can occur is that the system can no longer accept legitimate
TCP/IP connections.
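The effect of steps 1 to 8 on the server can be illustrated with a purely in-memory model of the half-open connection quota. This is a sketch under simplifying assumptions: no packets are sent, the class and function names are hypothetical, and the timeout that real stacks apply to half-open entries is left out.

```python
class HalfOpenQueue:
    """Toy model of a server's finite quota of half-open connections."""

    def __init__(self, limit: int):
        self.limit = limit
        self.pending = set()   # (source_ip, source_port) still awaiting the final ACK

    def on_syn(self, source) -> bool:
        """Return True if the SYN is accepted (a SYN-ACK would be sent)."""
        if len(self.pending) >= self.limit:
            return False       # quota exhausted: the request is refused
        self.pending.add(source)
        return True

    def on_ack(self, source) -> bool:
        """The final ACK completes the handshake and frees the slot."""
        if source in self.pending:
            self.pending.remove(source)
            return True
        return False


def simulate(limit: int = 5, spoofed_syns: int = 10):
    """Return (half_open_count, legitimate_client_accepted)."""
    queue = HalfOpenQueue(limit)
    # Spoofed sources never send the final ACK, so their entries stay pending.
    for i in range(spoofed_syns):
        queue.on_syn(("10.0.0.1", 1025 + i))
    legitimate_ok = queue.on_syn(("192.9.200.1", 1256))
    return len(queue.pending), legitimate_ok
```

With a quota of 5 and 10 spoofed SYNs the legitimate client is refused. With only 3 spoofed SYNs it is still accepted, which is why defences generally aim either to enlarge the quota or to expire half-open entries quickly.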


[Figure: An example SYN Attack used for Denial of Service. The attacker first
masquerades as B, then sends SYN packets to SERVER. B is a computer which cannot
respond with ACK to SERVER. The number of 1/2 open TCP/IP connections on SERVER
increases. Legitimate clients can no longer open TCP/IP connections with SERVER.]


Exploiting the source-route options 

The source route option (either strict or loose) can be used to force traffic along 
a certain path. Most normal applications never use the source route option. This 
is used more for network testing purposes along with record route etc. 

Applications using UDP 

UDP presents its own set of security related issues. UDP is essentially
stateless and does not have any connection establishment handshake mechanism.
Hence, there is no way of authenticating the sender of a UDP packet. UDP
applications therefore need to provide some form of authentication service, or the
firewalls used have to be able to maintain state information for UDP
connections.

Denial of service attacks using ICMP 

One of the newer examples is commonly referred to as the ping of death.
This is the use of extra large ping packets to overwhelm the TCP/IP stack on a
host. Another method of denial of service using ICMP is to send false ICMP
messages saying that a particular host is not reachable, or is reachable via an
alternate path.

Next we shall look at the security weaknesses of the daemons and applications 
which use TCP/IP. 

SMTP 

SMTP stands for the Simple Mail Transfer Protocol. E-mail, according to most
system administrators, happens to be the most sought after service on the network.
SMTP has some very serious security limitations, some of which have been fixed.
One of the most apparent problems is that there is no way in SMTP to verify the
source of the message. The SMTP daemon does not bother about checking the
domain or user name that the message is supposed to have come from. This
problem is now alleviated by the use of signed mail messages (signed with digital
certificates). The other, much more famous, weakness with SMTP involved the bug
used by the Internet Worm (November 2-5, 1988). This bug has since been fixed
on almost all implementations of SMTP. The bug consisted of the ability to
send the sendmail daemon into debug mode. The sendmail daemon runs as root.
Once it is in debug mode it is possible to execute Unix commands as root. One of the
commands could be: rm -rf / &.

Telnet 

Telnet is the service on UNIX which allows users to log in remotely to a different
machine. IP does not support encryption and hence telnet login names and
passwords travel in plain text. These can be easily picked up by an attacker with
a network analyzer. Telnet is also used to mount other attacks like that on
sendmail.





Finger 

Finger services are used to provide users with information about other users.
Hence, using finger on server1.acme.com will give an attacker a list of users on
this system. He can then launch a dictionary attack to guess their passwords.

RPC-NIS 

RPC, or Remote Procedure Call, was developed by Sun Microsystems.
RPC allows for the replacement of many TCP/IP tools with easier to use RPC
tools. RPC does support DES in its secure implementation. One of the services
which utilize RPC is the NIS or Network Information Service. The command
'rpcinfo' can give an attacker a lot of important information about a system such
as what file system the system uses, what processes are running on it etc.

The NIS also distributes information like password files, host address tables 
and public/private key databases for secure RPC. This information in the hands 
of an attacker can be catastrophic. 

The FTP family 

The file transfer protocol enables users to transfer files between computers.
The basic FTP implementation is familiar to most users. The less familiar ones are
the Trivial File Transfer Protocol (TFTP) and FSP. FTP operates by setting
up a command channel and a data channel between 2 machines. The command
channel is opened by the originator. The data channel is opened by the destination
server back to the originator. This itself is enough to throw packet filters, whose
filtering accounts for the direction of connection establishment, into a spin. As with
Telnet, FTP login names and passwords are transmitted in plain text. The other
problem with FTP is that anonymous FTP sites which allow read and write access
can be misused for the distribution of illegal, pirated or pornographic software.

TFTP uses UDP. It is commonly used to boot diskless workstations. The
interesting thing about older TFTP implementations is the fact that there is
no restriction on the files that can be transferred.

The third file transfer protocol, FSP, is obscure and is similar to FTP except that it
works over UDP. Also called the Sneaky File Transfer Protocol, it is seldom used
except for illicit purposes.

The Berkeley Remote Services 

The 'r' services allow administrators to work on remote machines as if they
were local. There are 3 criteria to be met. The call must originate on a privileged
port (and this cannot be enforced on a PC); the caller machine must be listed in
the /etc/hosts.equiv file or the $HOME/.rhosts file, which can be overwritten in
numerous ways; and the caller's name must correspond to its IP address.

'rlogin' allows telnet access without password authentication.

'rsh' allows the user to start a shell on another machine that trusts the user and
then execute shell commands.

'rexec' works similarly to 'rsh'.

DNS Services 

Domain Name Services are used to resolve names to IP addresses on a
network. They are designed around a distributed hierarchy of domain names.
DNS servers can be compromised to provide false information and hence disrupt
services.

Security issues with the New Generation of Applications 

The newer generations of Internet applications are designed to be secure.
However, serious security issues with both Java and ActiveX have been noticed.
These have been fixed almost immediately, but more weaknesses are sure to be
found.


In conclusion, TCP/IP version 4 has a whole host of security
weaknesses. The access control security mechanism must be
designed to help minimize the effects of the security limitations.

The Need for Address Translation 

One of the greatest challenges faced by the Internet community due to the
exponential growth of the Internet is the fact that the globally unique address
space will be exhausted. A separate and far more pressing constraint is the fact
that the amount of routing overhead will grow beyond the capabilities of Internet
Service Providers. To contain this growth an ISP obtains a block of addresses from an
address registry and then assigns addresses from within that block to its customers.
This enables the aggregation of routes.

However, with the limited address space available it is not realistic to assume
that an organization will get enough IP address space to ensure enterprise wide
connectivity. Hence, organizations have to use either a set of "illegal" non-IANA assigned
addresses or the RFC 1918 specified, IANA reserved addresses for use in Private Address
Space.

The Internet Assigned Numbers Authority (IANA) has reserved the following
addresses for use on private internets:


From            To                  Comment

10.0.0.0        10.255.255.255      24 bit block
172.16.0.0      172.31.255.255      20 bit block
192.168.0.0     192.168.255.255     16 bit block
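A check for membership in these reserved blocks can be sketched with Python's standard ipaddress module. The block sizes in the table count host bits, so a 24 bit block corresponds to a /8 prefix, a 20 bit block to /12 and a 16 bit block to /16. The function name in_private_space is hypothetical.

```python
import ipaddress

# The three reserved blocks, written in prefix notation.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # 24 host bits
    ipaddress.ip_network("172.16.0.0/12"),    # 20 host bits
    ipaddress.ip_network("192.168.0.0/16"),   # 16 host bits
]


def in_private_space(address: str) -> bool:
    """True if the address falls inside one of the reserved private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)
```

An address such as 172.16.10.51 is inside the private space and therefore has to be translated before it can appear on the Internet, whereas 202.54.13.172 is not.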

Just to reiterate an oft repeated concept, it is essential for every host connected
to the Internet to have a unique IP address as its identifier. However, the matter is
a little subtler. It is NOT hosts which connect to the Internet; it is actually applications
running on hosts which connect to the Internet.

Hence, the basic requirement is that every application connecting to the Internet
must have a unique identifier comprising the IP address of the host it is
running on and the port it uses.

This is the basis of all address translation. 
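Because the identifier is an (address, port) pair, a translation device can be modelled as a table that maps internal pairs to legal pairs. The sketch below is a simplified, hypothetical model: it covers fixed one to one entries for servers and port-multiplexed entries for clients, the addresses are illustrative, and the starting port of 30000 is an arbitrary assumption.

```python
class NatTable:
    """Toy address translation table keyed by (ip, port) pairs."""

    def __init__(self, legal_ip: str, first_port: int = 30000):
        self.legal_ip = legal_ip
        self.next_port = first_port
        self.static = {}      # internal (ip, port) -> legal (ip, port), fixed
        self.dynamic = {}     # internal (ip, port) -> legal (ip, port), per connection
        self.reverse = {}     # legal (ip, port) -> internal (ip, port)

    def add_static(self, internal, legal):
        """One to one mapping for a server that must be reachable from outside."""
        self.static[internal] = legal
        self.reverse[legal] = internal

    def translate_out(self, internal):
        """Translate the source of an outgoing packet."""
        if internal in self.static:
            return self.static[internal]
        if internal not in self.dynamic:
            # Clients share one legal address and are told apart by port.
            legal = (self.legal_ip, self.next_port)
            self.next_port += 1
            self.dynamic[internal] = legal
            self.reverse[legal] = internal
        return self.dynamic[internal]

    def translate_in(self, legal):
        """Translate the destination of an incoming packet; None if unknown."""
        return self.reverse.get(legal)
```

Two internal clients that both happen to use source port 1256 come out with the same legal address but different ports, so replies can be steered back to the right host. A packet addressed to a pair with no entry is simply not translated.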

In an organization connecting to the Internet there are 2 types of hosts: hosts
which provide services that can be reached from the Internet (public domain
servers), and hosts which are only clients, i.e. use services off the Internet.
Address translation requires each to be dealt with separately. Public domain server
hosts need to be advertised to the Internet with fixed unique IP addresses and port
numbers. Hence, these hosts need to be translated to legal addresses using static
address translation. This is a one to one mapping. Hence, for each of these servers
there has to be a unique legal address assigned.

Client hosts on the other hand can initiate connections from any port. Hence,
numerous client hosts can share the same legal IP address through an Address
Translation device, which will multiplex clients on ports. The diagram below
details Address Translation.

[Figure: Address translation. Internal addresses: public domain server, internal IP
address 172.16.10.1, port number 80; client initiates connection to
www.netscape.com from IP address 172.16.10.51, port number 1256. Legal addresses:
202.54.13.172.]

The address translation table on the NAT device would show the following:

Connect direction: From Client
    Source address / port:        172.16.10.51 / 1256
    Translated address / port:    202.54.13.172 / [illegible] (source translated)
    Destination address / port:   www.netscape.com / [illegible]

Connect direction: Reply from www.netscape.com
    Source address / port:        www.netscape.com / [illegible]
    Translated address / port:    [illegible] / 1256 (destn translated)
    Destination address / port:   202.54.13.172 / [illegible]

Connect direction: To PD server
    Source address / port:        202.54.1.30 / 8765
    Translated address / port:    172.16.10.1 / [illegible] (static dest translation)
    Destination address / port:   202.54.13.172 / 3076

Connect direction: From PD server
    Source address / port:        172.16.10.1 / 80
    Translated address / port:    202.54.13.182 / [illegible] (static source translation)
    Destination address / port:   202.54.1.30 / 8765

Address translation is also used as a security mechanism to hide the addresses of
internal hosts.

Implementing Access Control using Firewalls

What is a firewall?

Firewall noun: A fireproof wall used as a
barrier to prevent the spread of a fire
- American Heritage Dictionary




































A firewall implementation must have the following properties:

1. All traffic in and out must pass through the firewall

2. Only authorised traffic is allowed to pass

3. The firewall itself is immune to penetration

- Firewalls: The actual definition

[Figure: Situating Firewalls.]

Firewalls can be implemented

• On packet filtering routers

• As Circuit level gateways

• On Application level gateways

• Using SMLI devices

Firewalls are normally implemented as gateways with 2 or more NICs. All 
traffic in and out of a network is forced to pass through the firewall. The organization 
security policy is defined on the firewall in terms of rules that define access 
restrictions. 


Packet Filtering Firewalls

Packet Filtering Firewalls are based on routers or computers running software
configured to screen incoming and outgoing packets. The decision to accept or deny
packets is based on information contained in the TCP and IP headers of the packet.
Packet filters do not understand applications and do not make decisions at the
application layer.

[Figure: Packet Filters examine the contents of IP and TCP headers to make decisions]

Most packet filters make decisions based on the following:

• Source address 

• Destination address 

• Application or protocol 

• Source port number 

• Direction of establishment of connection 

Packet Filters maintain a table of rules that dictate what type of packets should
be allowed or denied.

The packet filter scans these rules till it finds one that agrees with the
information in a packet's full association. If no rule is found it uses a default rule.

These rules are commonly called access lists. 
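The rule scan described above, first match wins with a default rule as the fallback, can be sketched as follows. The Packet and Rule types, the field names and the example rules are hypothetical and do not follow the syntax of any particular router.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Packet(NamedTuple):
    protocol: str      # "tcp" or "udp"
    src: str
    src_port: int
    dst: str
    dst_port: int


class Rule(NamedTuple):
    action: str                       # "permit" or "deny"
    protocol: Optional[str] = None    # None matches any value
    src_net: Optional[str] = None
    dst_net: Optional[str] = None
    dst_port: Optional[int] = None


def matches(rule: Rule, pkt: Packet) -> bool:
    """True if every field the rule specifies agrees with the packet."""
    if rule.protocol is not None and rule.protocol != pkt.protocol:
        return False
    if rule.src_net is not None and \
            ipaddress.ip_address(pkt.src) not in ipaddress.ip_network(rule.src_net):
        return False
    if rule.dst_net is not None and \
            ipaddress.ip_address(pkt.dst) not in ipaddress.ip_network(rule.dst_net):
        return False
    if rule.dst_port is not None and rule.dst_port != pkt.dst_port:
        return False
    return True


def filter_packet(rules: List[Rule], pkt: Packet, default: str = "deny") -> str:
    """Scan the access list in order; fall back to the default rule."""
    for rule in rules:
        if matches(rule, pkt):
            return rule.action
    return default
```

Rule order matters: a deny rule for packets arriving from outside with a private source address has to come before the permit rule for the web server, otherwise a spoofed packet would match the permit rule first.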

The process of configuring access lists is not easy. You can use access lists to
allow or deny services based on TCP and UDP port numbers. However, the lack
of application understanding in packet filters implies that they are not very well
suited to handling applications which open reverse connections (e.g. FTP) in a
secure manner. Packet filters are also susceptible to spoofing attacks.

Packet filters are inherently low cost security solutions (you need a router to 
connect in any case!) but are not complete solutions. It is a good idea to implement 
access lists on a router as an additional wall of defence but to rely on a firewall 
for better security. 

Circuit Level Gateways 

Circuit level gateways monitor TCP handshaking between packets from trusted
clients or servers to untrusted clients or servers and vice versa to determine if a
given session is legitimate. A trusted client requests a service and the gateway
accepts this request, provided of course that the client meets the trust criteria of
the gateway. Next, acting on behalf of the client, the gateway opens a connection
with the untrusted host and then closely monitors the TCP handshaking that follows. A
circuit level gateway determines that a session is legitimate only if the SYN and
ACK flags as well as the sequence numbers involved are logical. After this the
gateway simply copies and passes packets between trusted and untrusted hosts
without further filtering. In order to be able to achieve this, circuit level gateways
use generic proxy services. Generic proxies are applications that can perform this
copy operation and establish a virtual tunnel through the gateway. Circuit level
gateways are seldom used on their own. They are bundled with application level
gateways to increase the functionality of the latter.
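What it means for the flags and sequence numbers to be "logical" can be made concrete with a small check over the three observed handshake segments. This is only a sketch of the idea: the Seg type and function name are hypothetical, and a real gateway tracks considerably more state than this.

```python
from typing import NamedTuple, Optional, Sequence

SEQ_SPACE = 2 ** 32  # sequence numbers wrap at 32 bits


class Seg(NamedTuple):
    seq: int
    ack: Optional[int]   # None when the ACK flag is not set
    syn: bool
    ack_flag: bool


def handshake_is_logical(segs: Sequence[Seg]) -> bool:
    """Check SYN, SYN-ACK, ACK for consistent flags and numbers."""
    if len(segs) != 3:
        return False
    syn, syn_ack, ack = segs
    return (
        syn.syn and not syn.ack_flag                      # opening SYN
        and syn_ack.syn and syn_ack.ack_flag              # reply has both flags
        and syn_ack.ack == (syn.seq + 1) % SEQ_SPACE      # acknowledges client ISN
        and not ack.syn and ack.ack_flag                  # final ACK only
        and ack.seq == (syn.seq + 1) % SEQ_SPACE
        and ack.ack == (syn_ack.seq + 1) % SEQ_SPACE      # acknowledges server ISN
    )
```

Only once this check passes would the gateway start copying packets between the two sides.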

Circuit level gateways perform dynamic address translation by default. All 
packets on the untrusted network emerge with the IP address of the external 
interface of the proxy. Circuit level gateways also provide automatic protection 
against spoofing attacks by hiding all internal IP addresses. 

Application Level Gateways 

Application level gateways are functionally very similar to circuit level gateways
in that they intercept incoming and outgoing packets, run proxies to copy
and forward information, and prevent any direct connection between trusted and
untrusted hosts. However, the proxies used by an application level gateway
function at the application layer. These proxies are inherently application specific;
hence only a WWW proxy can accept, copy and forward a WWW connection.
Networks protected by application level gateways cannot access services for
which there is no proxy on the gateway. These proxies examine and filter every
packet at the application layer and can hence filter particular types of commands
or application level information. For example, you can restrict FTP operations to FTP
get only and not FTP put. Application level gateways are traditionally strong on
logging application level events.
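The "get but not put" restriction can be sketched as a command filter that an FTP proxy might apply to each line on the command channel. On the wire, FTP get and put correspond to the RETR and STOR commands. The allow-list below and the function name are assumptions chosen for illustration, not a complete policy.

```python
# Commands the gateway is willing to forward; STOR (put) is deliberately absent.
ALLOWED_COMMANDS = {
    "USER", "PASS", "CWD", "PWD", "LIST", "NLST",
    "TYPE", "PORT", "PASV", "RETR", "QUIT",
}


def allow_ftp_command(line: str) -> bool:
    """Decide whether one command-channel line may be forwarded to the server."""
    parts = line.strip().split(None, 1)
    if not parts:
        return False                  # empty line: nothing to forward
    verb = parts[0].upper()           # FTP command verbs are case-insensitive
    return verb in ALLOWED_COMMANDS
```

An allow-list is used rather than a deny-list so that any command the policy does not mention is refused, in line with the principle of never giving more access than is absolutely required.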

Application level gateways are among the most secure firewalling mechanisms
available but are plagued by a few drawbacks. Chief amongst them is the lack of
transparency. This causes users to go through multiple levels of login before they
can access resources. Application level proxies also need modification to client end
programs in order to redirect all requests through the proxy. While some applications
like Telnet, SMTP and WWW are easily proxied it is not the same for all applications.
One of the methods of overcoming the transparency problem is to deploy SOCKS
at the client end. The SOCKS protocol provides transparent authentication services
for clients requesting connections to devices through firewalls. The other drawback
of application level gateways is a lack of flexibility in terms of application services
offered.

Stateful Inspection Firewalls 

Stateful inspection firewalls combine aspects of packet filters, circuit level gateways
and application level gateways. The stateful inspection firewall at the network
level filters all incoming and outgoing packets based on source and destination IP
addresses and ports. At the session level it determines whether the packets in a
session are appropriate. Finally, it maintains application level understanding of
the packets. Stateful firewalls, unlike application level gateways, do not require the
presence of application level proxies at the gateway. Rather they rely on powerful
inspection algorithms, which recognize and process application level information.

SMLI or stateful multilayer inspection, which was developed by Check Point
Software Technologies, works on the principle of maintaining state information at
various layers of the communication stack. The firewall module intercepts incoming
packets before they are handled by the OS's TCP/IP stack. These firewalls have
access to complete network and session state information, but that is not all. In
addition, powerful application state algorithms enable them to maintain
application state. SMLI firewalls also have great flexibility in the handling of
UDP, which is a stateless protocol and generally acknowledged as a hard protocol
to secure. This is achieved by maintaining state information for the entire UDP
session.

[Figure: SMLI devices maintain state information across multiple layers]
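One way to picture state kept for a stateless protocol is a table of recent outbound UDP exchanges that is consulted when a datagram arrives from outside. The sketch below is a hypothetical model, not the mechanism of any product: the 30 second timeout is an arbitrary assumption, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
class UdpStateTable:
    """Toy table of UDP 'sessions' created by outbound datagrams."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout
        self.sessions = {}   # (inside_addr, outside_addr) -> time last seen

    def outbound(self, inside, outside, now: float) -> None:
        """Record that an inside host sent a datagram to an outside host."""
        self.sessions[(inside, outside)] = now

    def inbound_allowed(self, outside, inside, now: float) -> bool:
        """Allow a reply only if it matches a recent outbound datagram."""
        key = (inside, outside)
        last = self.sessions.get(key)
        if last is None or now - last > self.timeout:
            self.sessions.pop(key, None)   # unknown or expired entry
            return False
        self.sessions[key] = now           # traffic keeps the entry alive
        return True
```

A datagram from an outside address that no inside host has recently contacted finds no entry and is refused, even though UDP itself has no handshake to inspect.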

SMLI firewalls maintain rule base tables that define the organization's security
stance to be implemented by the firewall. SMLI firewalls do allow direct connections
between trusted and untrusted hosts. This, according to some schools of thought,
is a security risk. This is however not true as the firewall maintains complete state
information and is aware of the connection at all times. SMLI firewalls provide
much more flexibility and user transparency than any of the other categories of
firewalls without compromising on security and at the same time offering higher
performance.

Amongst the products offered in the market, with the exception of Checkpoint 
Firewall-1 and the On Guard product (both are stateful inspection products) most 
other firewalls are hybrid firewalls. They combine application level gateways and 
circuit level gateways to provide both security and flexibility. 

Any firewall must offer the following functionality: 

• Ease of use and manageability 

• Logging and alerting mechanisms 

• Default Deny rules and the ability to stop IP forwarding if the firewall is not 
running 

• Anti spoofing mechanisms 

• Defence against denial of service attacks 

• Strong authentication mechanisms 

• Encryption mechanisms 

In conclusion, firewalls form the defence perimeter for a network. All traffic
in and out of the network must pass through the firewall in order to be secured.
There is, however, no firewall mechanism which can claim to be 100 percent
secure. Still any modern firewall is a very tough device to break through. The
process of deploying a firewall is not just setting up a plug-and-play box. It
involves the creation of very specific rules to permit or deny access. Many hackers
try to exploit the weaknesses in the OS the firewall is running on. Hence, many
firewalls run either on specific "hardened" Operating Systems or provide other
mechanisms to protect the OS.

Finally, it is not a very good idea to run any other software on a firewall 
machine. Firewalls should be specific and not general purpose machines. 

Authentication Mechanisms 

While firewalls address the issue of access control to network resources, the ability
to identify users is the task of a good authentication mechanism. Most network
operating systems provide the users with login names and passwords for system 
access. They also define various authorization levels for each user. One of the 
commonest breaches of security happens when an intruder discovers a legitimate 
user's password for a system. The defence against this is to use a strong 
authentication mechanism. 

The authentication mechanisms available range from simple login names and 
passwords through digital signatures to biometrics verification techniques. The 
choice of an appropriate authentication mechanism is defined by the security 
requirement of the site.

The most common authentication mechanism is based on login names and 
passwords. These are susceptible at various points: 

1) The names and password database on the authentication server can be 
compromised if stored in plain text. 

2) The names and passwords may be snooped on whilst in transit if they are 
transmitted in clear text as in the case of Telnet and FTP. 

3) The password associated with a particular user login name may be guessed 
in a variety of ways. 

It is not a good idea to use the names of persons or things that can be easily 
associated with the user. Further the use of English words as passwords is 
susceptible to what are known as dictionary attacks. 

Recommended strong password mechanisms call for passwords
that have a mix of alphabetic and numeric characters. However, passwords have to be
remembered by their users. Having an extremely strong 20 character password
is of no use if the only way a user can use it is by writing it down and sticking
it to the top of his monitor. A good technique is to use the first letters of the
words of a song etc.
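The first-letters technique, together with a simple dictionary check, can be sketched as below. The phrase and the word list in the usage note are made up for illustration, and the function names are hypothetical.

```python
def initials_password(phrase: str) -> str:
    """Build a password from the first letter of each word in a phrase."""
    return "".join(word[0] for word in phrase.split())


def is_dictionary_word(candidate: str, wordlist) -> bool:
    """True if the candidate appears in the word list, ignoring case."""
    return candidate.lower() in {word.lower() for word in wordlist}
```

For the phrase "Twinkle twinkle little star how I wonder what you are" the result is "TtlshIwwya", which is easy to recall from the phrase but does not appear in a word list the way "dragon" would. Mixing in digits would bring it closer to the alphanumeric recommendation above.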





However, multiple use passwords are still a security risk, since if the password 
is compromised it can always be reused. The solution to this is what is known as 
one time passwords. 

The S/Key technique of one-time passwords involves the generation of 
strings of random English words. Each string forms a password and can be used 
only once. This is a very strong password mechanism, but maintaining the 
generated passwords is an administrative nightmare. 
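
One common way to realise passwords that can each be used only once is a chain 
of one-way hashes, which is the idea underlying S/Key. The sketch below is a 
simplified illustration under stated assumptions: SHA-256 is swapped in for 
the hash used by S/Key, the encoding of each value as English words is 
omitted, and the names `make_chain` and `OneTimeVerifier` are hypothetical. 

```python
import hashlib


def _h(data):
    """One-way function applied repeatedly to form the chain."""
    return hashlib.sha256(data).digest()


def make_chain(secret, count):
    """Return [h(secret), h(h(secret)), ...] with `count` links."""
    chain = []
    value = secret
    for _ in range(count):
        value = _h(value)
        chain.append(value)
    return chain


class OneTimeVerifier:
    """Server side: stores only the most recently accepted link."""

    def __init__(self, last_link):
        self.expected = last_link

    def accept(self, candidate):
        # A valid password hashes to the stored value; it then replaces
        # the stored value, so the same password cannot be accepted twice.
        if _h(candidate) != self.expected:
            return False
        self.expected = candidate
        return True
```

The user works backwards through the chain: after the server has been given 
the last link, the first login presents the link before it, the next login 
the one before that, and so on. 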

The other technique is to use smart cards like the ones provided by 
Secure-ID. These solutions use a client-side card with a random number 
generator. This random number generator and the PIN of the card uniquely 
identify the card. Moreover, the number changes every minute. Hence, these 
cards are used to maintain very high levels of authentication security. 
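
The idea of a card whose number changes every minute can be illustrated with a 
time-based code derived from a secret shared between the card and the server. 
The algorithms inside commercial cards are proprietary and are not reproduced 
here; the sketch below instead uses the openly specified HMAC truncation of 
the HOTP/TOTP standards, and every name in it (`token_code`, `verify_login`, 
`STEP_SECONDS`) is an assumption. 

```python
import hashlib
import hmac
import struct

STEP_SECONDS = 60  # the displayed number changes once per minute


def token_code(card_secret, unix_time, digits=6):
    """Derive the code shown on the card for the minute containing unix_time."""
    counter = unix_time // STEP_SECONDS
    mac = hmac.new(card_secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation as in HOTP
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def verify_login(card_secret, stored_pin, pin, code, unix_time):
    """Server side: both the PIN and the current code must match."""
    pin_ok = hmac.compare_digest(pin, stored_pin)
    code_ok = hmac.compare_digest(code, token_code(card_secret, unix_time))
    return pin_ok and code_ok
```

A code observed by an eavesdropper is of no use once the minute has passed, 
and the code alone is not sufficient without the PIN. 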

Digital signatures and digital IDs attempt to prove authenticity through the 
use of trusted third parties or certification authorities (CAs). These 
certificates cannot be tampered with undetected (they are signed using the 
private key of the CA) and serve as proof of ownership. 
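
The sign-with-private-key, check-with-public-key relationship can be shown 
with a deliberately tiny textbook RSA key. This is a toy for illustration 
only: the key is far too small to be secure, no padding scheme is used, and 
the names `sign` and `verify` are hypothetical rather than part of any 
certificate library. 

```python
import hashlib

# Toy textbook key built from the primes p = 61 and q = 53 (insecure).
N = 3233   # modulus, p * q
E = 17     # public exponent, known to everyone
D = 2753   # private exponent held by the CA; E * D == 1 (mod 3120)


def digest(message):
    """Reduce a SHA-256 digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message):
    """CA side: sign the digest with the private exponent."""
    return pow(digest(message), D, N)


def verify(message, signature):
    """Anyone: check the signature using only the public values E and N."""
    return pow(signature, E, N) == digest(message)


certificate = b"subject=alice; public-key=..."  # placeholder certificate body
signature = sign(certificate)
print(verify(certificate, signature))            # True
print(verify(certificate, (signature + 1) % N))  # False
```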

At the highest end come various biometric verification techniques, which 
use a variety of parameters such as body resistance, corneal patterns and 
fingerprints. 

Summary 

The moral of this story is, anything you do not understand is 
dangerous until you do understand it. 

Beowulf Schaefer in Flatlander 
Larry Niven 

While no access control mechanism is one hundred percent secure, always 
remember: "There be dragons". Access control devices provide the first 
perimeter of security for an organization. They are often the highest-priority 
deployment area of a security policy, and they are based on numerous 
technologies. Defining a good access control policy document goes a long way 
towards implementing enterprise-wide security. The strength of an access 
control mechanism lies in its definition; hence, always use explicit rules. 
Strong authentication mechanisms are a logical extension of a good access 
control policy. Access control is followed by transaction control to complete 
the security stance of an organization. 





National Information Infrastructure (NII): 

Issues and Plans 

S. RAMAKRISHNAN 
Director, DOE, New Delhi 

ABSTRACT 

It is almost a quarter century since an early version of the Internet called 
Arpanet was first demonstrated in 1972. The Internet has come of age in this 
period and has captured people's information exchange, storage and retrieval. 
It is perhaps the first broad-based demonstration of a universal service based 
on a computer network. 

However, today's Internet is viewed by many as a primitive one compared to 
the Information Infrastructure of tomorrow, often referred to as the National 
Information Infrastructure (NII) and the Global Information Infrastructure 
(GII). They represent the concept of information generation, storage and 
retrieval over a computer networking infrastructure which, while retaining the 
positive aspects of today's Internet such as low-cost access, seamless 
connectivity and hypertext-based information retrieval, dramatically improves 
areas considered inadequate in today's Internet: high bandwidth to support 
multimedia applications, higher security and privacy, applications like 
electronic commerce with user-friendly interface tools, and higher-quality 
information sources such as digital libraries. 

While today's Internet is considered a good starting point from which 
tomorrow's NII/GII can be launched, one area of critical importance is a viable 
business model. While private Government/public sector is expected to serve the 
interest of important sections of national relevance such as education 
service/subsidy provision or by policy intervention to ensure that NII/GII gets 
universal reach. 

Various programmes like ERNET, NICNET, and the one offered by VSNL have 
brought about a sea change in Internet usage in the country. Obviously, they 
have just scratched the surface in relation to the potential. Discussion on 
NII/GII may have also begun in forums like MAIT, DOE, etc. to help them 
crystallize basic plans in conjunction with various user groups. It is 
important for different agencies to articulate their needs. This paper would 
provide the background for detailed discussion among the participants on the 
matter. 





Network Technology for Multimedia 
Information Dissemination 

T. VISHWANATHAN 
Director, INSDOC, New Delhi 

ABSTRACT 

Much as one witnessed developments in the field of microprocessors in the 
late 70s and in the field of parallel processing in the late 80s, the mid 90s 
are witnessing large-scale induction of multimedia in every walk of life. In 
tune with this, the information-seeking behaviour of persons is also changing, 
with users demanding multimedia information. 

Multimedia information communication calls for the ability to transfer 
sound, video, text, graphics, pictures, images, movie clips, etc. 
simultaneously. We need newer technologies, techniques, and network management 
philosophy in order to implement multimedia information communication 
efficiently. 

Three technologies are fast evolving and competing with each other: 
terrestrial, radio and satellite technologies. A number of new techniques such 
as ATM and Frame Relay are being proposed for making efficient use of the media 
and bandwidth. A variety of new services are being offered on the network using 
a common digital base, which may be useful for multimedia information 
communication. In this context, broadband ISDN development appears to be 
important. 

A lot needs to be done in the area of regulatory issues, as they appear to be a 
major stumbling block in the growth of worldwide multimedia information 
communication. Problems related to transborder data flow, cultural impacts and 
political fallout are some issues that need to be taken into account while 
formulating regulations. The success of networking would depend on our ability 
to operate and manage the network efficiently. Remote diagnostics, effective 
billing engines, configuration management and scheduling are some important 
aspects of network management that need to be addressed in the context of 
multimedia information communication. 






